Abstract

Air  Mobility  Command’s  (AMC)  airlift  assets  that  transit  airfields  in  Afghanistan 
are given only a small variety of ground times in order to accomplish their mission. These
ground times are based on overarching categories of missions that aircraft execute, such
as  cargo  upload,  cargo  download,  passenger  upload,  passenger  download,  or  a 
combination  of  these.  The  current  mission  planning  system  uses  these  overarching 
categories  to  plan  ground  times  and  does  not  account  for  the  exact  amount  of  cargo  or 
passengers.  This  leads  to  longer  or  shorter  ground  times  than  planned.  In  order  to  increase 
stability  at  these  fields  and  better  account  for  the  number  of  sorties  that  can  be  planned 
into  Afghanistan,  a  method  to  calculate  optimal  or  near  optimal  ground  times  is  needed. 

This  research  creates  a  linear  regression  model  that  accounts  for  the  size  of  cargo 
upload,  cargo  download,  passenger  upload,  and  passenger  download  known  by  the 
mission  planner.  This  model  can  be  used  by  the  mission  planners  at  AMC’s  Tanker 
Airlift  Control  Center  (TACC)  to  increase  the  efficiency  of  planning  sorties  into 
Afghanistan.  Six  months  of  historical  data  is  filtered  and  categorized  and  then  analysis  is 
accomplished  using  the  JMP  linear  regression  program.  Eight  scenarios  are  analyzed  to 
account  for  C-17,  C-130  and  C-5  missions  to  Bagram  AB,  Kandahar  AB  and  Camp 
Bastion  airfields  in  Afghanistan.  Analysis  is  concluded  and  insights  are  drawn  regarding 
how  to  stabilize  planned  ground  times. 

Three of the scenario models are found to be significant and are validated with
split data from a separate month. None of the C-130 models is significant, due
to many factors. The remaining insignificant models can be attributed to data system
errors  and  unexplained  variance.  The  use  of  the  three  significant  models  will  increase 
stability  in  AMC  planning  and  efficiency.  In  turn,  our  overall  wartime  effectiveness  will 
be  enhanced. 
OPTIMIZING GROUND TIMES FOR AMC AIRCRAFT IN AFGHANISTAN


1.  Introduction 


1.1.  Background 

Air Mobility Command’s (AMC) mission statement is “Provide Global Air
Mobility ... Right Effects, Right Place, Right Time” (www.amc.af.mil). Since the
beginning of Operation Enduring Freedom, AMC has tried to accomplish this and has
airlifted “approximately 12.5 million passengers, delivered more than 4.5 million tons of
cargo, distributed more than 1.5 billion gallons of fuel, and performed nearly 133,000
patient movements” (Wilson). AMC accomplishes its airlift mission with aircraft such as
the C-17, C-5, C-130, KC-10 and KC-135. These aircraft have specific missions. The
C-17, C-5 and C-130’s primary missions are to deliver cargo, in many different forms, to
areas around the world. The KC-135 and KC-10’s primary mission is air refueling with a
secondary mission of moving cargo.

This research focuses on the cargo delivered into Afghanistan. Since the bulk of
the  cargo  is  delivered  from  the  C-17,  C-5  and  the  C-130,  the  focus  is  on  these  aircraft. 

The  C-17  can  carry  102  troops/paratroops  (188  troops  with  palletized  seating),  36  litter 
and 54 ambulatory patients and attendants, 170,900 pounds of cargo with up to 18 pallet
positions and can fly between 2,400 and 6,000 nautical miles (dependent on cargo weight)
without air refueling. The C-5 can carry 73 passengers, 270,000 pounds of cargo with up
to  36  pallet  positions  and  can  fly  up  to  6,320  nautical  miles  (dependent  on  cargo  weight) 
without air refueling. Both aircraft have a virtually unlimited range when utilizing
in-flight refueling. The C-130 can carry 40,000 pounds of cargo with 6-8 pallets or 74-97
litters  or  16-24  CDS  bundles  or  92-128  combat  troops  or  64-92  paratroopers,  or  a 
combination  of  any  of  these  up  to  the  cargo  compartment  capacity  or  maximum 
allowable  weight  and  can  fly  1200-2000  nautical  miles  (dependent  on  cargo  weight) 
(AFPAM 10-1403, Air Force Aircraft Fact Sheets).

All  of  these  aircraft  have  operational  restrictions  in  order  to  land  at  certain 
airfields.  These  restrictions  are  mainly  determined  by  the  size  of  the  available  runway  and 
if  the  runway  is  stressed  for  a  specific  type  of  aircraft.  These  restrictions  keep  the  C-5  out 
of  many  airfields  in  Afghanistan.  It  only  lands  at  Bagram,  Kandahar,  and  Kabul  airfields. 
The C-17 can land on more runways, due to its smaller size and capability to land on
unprepared  surfaces.  The  C-130  is  the  most  versatile  of  the  three  cargo  aircraft  and  can 
land  at  almost  any  airfield  in  Afghanistan. 

1.2. Problem Statement

Air Mobility Command’s (AMC) airlift assets that transit airfields in
Afghanistan are given only a small variety of ground times (slot times) in order to
accomplish  their  mission.  These  are  based  on  what  overarching  type  of  mission  the 
aircraft  are  executing,  i.e.  cargo  upload,  cargo  download,  passenger  upload,  passenger 
download,  refueling,  or  a  combination  of  these  missions.  The  current  mission  planning 
system uses these overarching categories to plan ground times and does not account for
how much cargo or how many passengers are to be loaded or unloaded. This can be seen
in Table 1.1. If an aircraft has only a download or an upload, then the ground time is
shorter, e.g. 1+45 (one hour and forty-five minutes) for the C-17 with one event. If the
aircraft has both a download and an upload, then the ground time is increased, e.g. 3+15
for the C-5 with two events. These numbers were received from current Tanker Airlift
Control Center mission planners and Air Force Pamphlet 10-1403.


Table 1.1 Event Planning Ground Time

Acft     1 Event (hours+minutes)   2 Events (hours+minutes)
C-17     1+45                      2+15
C-130    0+45                      1+30
C-5      2+15                      3+15
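As a small illustration, the planning times in Table 1.1 can be held in a lookup table and converted to minutes for later comparison with actual ground times. This is only a sketch; the function and variable names are hypothetical and are not part of any AMC planning system.

```python
# Planning ground times from Table 1.1, keyed by (aircraft, number of events).
# Values are "hours+minutes" strings exactly as they appear in the table.
PLANNED_GROUND_TIME = {
    ("C-17", 1): "1+45", ("C-17", 2): "2+15",
    ("C-130", 1): "0+45", ("C-130", 2): "1+30",
    ("C-5", 1): "2+15", ("C-5", 2): "3+15",
}


def to_minutes(h_plus_mm):
    """Convert an 'H+MM' ground time such as '1+45' to whole minutes."""
    hours, minutes = h_plus_mm.split("+")
    return int(hours) * 60 + int(minutes)


def planned_minutes(aircraft, events):
    """Planned ground time in minutes for an aircraft type and event count."""
    return to_minutes(PLANNED_GROUND_TIME[(aircraft, events)])


print(planned_minutes("C-17", 1))  # 105
```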

The use of these generic times leads to much longer or shorter ground times than
planned. In order to stabilize airflow at these fields and better account for the number of
sorties that can be planned into Afghanistan, a method to calculate optimal or near-optimal
slot times is needed. This should increase the efficiency with which troops and cargo
are delivered to downrange locations. In turn, our overall wartime operations will be
enhanced.


1.3. Methodology

A retrospective study is accomplished to address the problem statement. This
includes historical data synchronization and linear regression methods. These are used to
build a suitable model for AMC to use to predict ground times more reliably. Data
synchronization is used to merge database information into a usable format. This format
is then used in linear regression software (JMP) to develop models to fit different
scenarios. These scenarios cover each airfield and each aircraft type individually; therefore,
eight models are developed, one per scenario.

Table 1.2 Scenario Matrix

         Bagram Air Base   Kandahar Air Base   Camp Bastion
C-17     Scenario 1        Scenario 2          Scenario 3
C-130    Scenario 4        Scenario 5          Scenario 6
C-5      Scenario 7        Scenario 8          C-5s do not transit

The different models are needed because each jet has different ramps, parking spaces,
aerial port capabilities and other variables per airfield. These other variables are assumed to
remain constant at these fields.

The linear regression analysis checks for model adequacy, significance,
multicollinearity, influence points, outliers, and other factors such as VIF, $C_p$, and
adjusted $R^2$ that are defined in Chapter 2. This should satisfy the need to understand if the
historical information is usable in the regression model. Following these tests, and based
on the regression coefficients, a regression equation is computed and used to predict how
long jets should be planned to stay on the ground per cargo and passenger loads. The
cargo is accounted for in pallet positions and the passengers are counted individually.
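The regression equation described above has one term per load type. The sketch below shows how such an equation might be applied once coefficients are available. The class name, field names, and especially the coefficient values are placeholders chosen for illustration; they are not results from this research.

```python
from dataclasses import dataclass


@dataclass
class GroundTimeModel:
    """Coefficients of a fitted ground-time regression, in minutes."""
    intercept: float
    per_pallet_up: float
    per_pallet_down: float
    per_pax_up: float
    per_pax_down: float

    def predict(self, pallets_up, pallets_down, pax_up, pax_down):
        """Predicted ground time in minutes for one planned stop."""
        return (self.intercept
                + self.per_pallet_up * pallets_up
                + self.per_pallet_down * pallets_down
                + self.per_pax_up * pax_up
                + self.per_pax_down * pax_down)


# Placeholder coefficients for illustration only; NOT fitted values.
example = GroundTimeModel(60.0, 3.0, 2.0, 0.2, 0.1)
print(example.predict(pallets_up=10, pallets_down=5, pax_up=50, pax_down=20))
```

One model instance per scenario (aircraft type and airfield) would mirror the scenario matrix, since each scenario has its own coefficients.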

1.4. Assumptions/Limitations

There are many assumptions and limitations that could impact this analysis and
mission success. These lie within the aircraft, the aircrew operating the aircraft, the Aerial
Port crews (crews that upload and download cargo), the airspace over and on the way to
the airfield, and the airfield itself. These are standard assumptions that are sometimes
taken for granted, but could severely impact mission success.

The assumptions are:

•  Aircrew  operating  the  same  type  of  aircraft  have  the  same  abilities  to 
download  all  different  types  of  cargo 

•  No  engine  running  offloads  or  onloads  are  accomplished  for  C-17  or  C-5 
aircraft 

•  Aerial  ports  have  the  needed  equipment  to  download  and  upload  all  types 
of  cargo  from  each  specific  aircraft  sent  to  its  airfield 

•  Aerial  port  members  have  the  same  ability  to  download  and  upload  cargo 

•  Airspace  is  open  over  and  leading  to  the  specific  landing  runway 

•  Runways  are  open  without  major  implications  to  inbound  or  outbound 
aircraft,  i.e.  the  runway  is  open  and  the  taxiways  to  the  parking  spots  are 
usable 

•  Weather  conditions  at  the  airport  during  landing  windows  satisfy  basic  Air 
Force Instruction 11-202V3 weather requirements

•  Time  to  transit  from  landing  to  parking  is  always  constant  per  airfield 

•  Time  to  transit  from  parking  to  takeoff  is  always  constant  per  airfield 

•  Parking  spots  per  aircraft  are  constant 

•  Crew  planning  and  inspection  times  are  constant 

•  Concurrent servicing of cargo and fuel is approved

Limitations impacting this study mainly come from acquiring the data to analyze the
ground times of different aircraft. The major limitations are listed below:

•  Not all aircrew and aerial port members have the same ability to download
and upload equipment

•  Data systems from which information is pulled, i.e. GDSS II and GATES,
are not perfect and rely on Airmen to input data correctly

   o  Data is not kept for all different types of delays on the ground
   o  Delay codes are considered inaccurate within GDSS II
   o  Changes to the scheduled ground times are made within GDSS II
      while the mission is active; the scheduled ground time should
      remain the same throughout the mission
   o  Numerous data points (i.e. mission information) have more pallet
      positions or passengers than are possible for the aircraft to hold
   o  Numerous data points have no cargo or passenger data

•  No database known at this time keeps track of how much fuel each jet has
uploaded and the time required for fueling
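Two of the data limitations above (records with impossible loads and records with no load data) lend themselves to a simple screening step. The following is a minimal sketch, assuming each record is a dictionary with hypothetical field names. The capacity limits use the upper ends of the ranges quoted in Section 1.1, which is an assumption made here for illustration.

```python
# Upper capacity limits taken from the figures quoted in Section 1.1.
# Using the top of each quoted range is an assumption of this sketch.
MAX_PALLETS = {"C-17": 18, "C-5": 36, "C-130": 8}
MAX_PAX = {"C-17": 188, "C-5": 73, "C-130": 128}

LOAD_FIELDS = ("pallets_up", "pallets_down", "pax_up", "pax_down")


def is_usable(record):
    """Return True if a record has load data that the aircraft could carry."""
    loads = [record.get(field) for field in LOAD_FIELDS]
    if any(value is None for value in loads):
        return False  # missing cargo or passenger data
    aircraft = record["aircraft"]
    # Upload and download are separate loads, so each is checked on its own.
    if max(record["pallets_up"], record["pallets_down"]) > MAX_PALLETS[aircraft]:
        return False
    if max(record["pax_up"], record["pax_down"]) > MAX_PAX[aircraft]:
        return False
    return True
```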

1.5. Research Objectives

The objective of this research is to build a model that accounts for the amount of
cargo uploaded, cargo downloaded, passengers uploaded, and passengers downloaded.
Other known delays, such as a medical evacuation, refueling, and any other delay of known
length, are added separately by the mission planner based on specific mission requirements.
This model is implemented in an Excel program that the mission planners at
AMC’s Tanker Airlift Control Center (TACC) can use to increase the efficiency of
planning sorties into Afghanistan.

In order to build this model, data is needed from TACC. This data consists of how
long  it  takes  to  upload  and  download  certain  amounts  and  types  of  cargo  and  passengers. 
This  data  can  be  taken  from  two  different  systems,  Global  Decision  Support  System  2 
(GDSS  II)  and  Global  Air  Transportation  and  Execution  System  (GATES). 

GDSS II is used by every Air Force command post at airfields in Afghanistan and
by  the  TACC.  Airmen  at  these  centers  input  data  including:  the  scheduled  and  actual 
land  times,  take  off  times,  and,  if  applicable,  reasons  for  delay.  GDSS  II  can  be  accessed 
by  most  Airmen  who  operate  in  the  AMC  environment  to  tell  when  incoming  planes  will 
be  landing,  how  much  cargo  they  have,  if  there  is  a  delay,  and  to  prepare  their  field  for 
the  incoming  aircraft.  This  is  an  essential  tool  for  Airmen  to  accomplish  their  jobs. 

GATES  is  a  system  used  by  Aerial  Port  members  to  track  passenger  and  cargo 
uploads  and  downloads.  This  system  records  cargo  movement  per  mission  identification 
numbers  into  and  out  of  individual  airports.  This  system  tracks  where  cargo  is  currently, 
where  it  came  from  and  where  it  is  going. 

The  GATES  system  is  used  to  pull  information  about  how  much  cargo  and  how 
many  passengers  were  downloaded  and  uploaded  onto  specific  aircraft  with  specific 
mission numbers. This information is cross-referenced with information from GDSS II on
scheduled  and  actual  ground  times.  A  model  is  developed  based  on  this  information  and 
used  to  improve  the  ground  time  planning  system. 
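The cross-referencing step can be pictured as a join on mission number and airfield. The sketch below assumes both extracts have already been exported as lists of dictionaries; every field name and the timestamp format are hypothetical, since the actual GDSS II and GATES export layouts are not described here.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed timestamp layout


def minutes_between(start, end):
    """Elapsed minutes between two timestamp strings."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta.total_seconds() / 60.0


def cross_reference(gdss_rows, gates_rows):
    """Attach scheduled and actual ground times to each load record."""
    loads = {(row["mission_id"], row["airfield"]): row for row in gates_rows}
    merged = []
    for stop in gdss_rows:
        load = loads.get((stop["mission_id"], stop["airfield"]))
        if load is None:
            continue  # no cargo or passenger record for this stop
        merged.append({
            **load,
            "scheduled_min": minutes_between(stop["sched_land"], stop["sched_takeoff"]),
            "actual_min": minutes_between(stop["actual_land"], stop["actual_takeoff"]),
        })
    return merged
```

Keying on both mission number and airfield is a design choice of this sketch, made because one mission may stop at more than one field.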

In order to test this model, six months of historical data are run through the model
and insight is drawn as to how much the ground time planning has changed and how
many more or fewer aircraft can be planned downrange in a given day and month. The
model  based  on  six  months  of  historical  data  is  then  used  on  subsequent  historical  months 
to  see  if  the  model  predicted  ground  times  were  closer  to  the  actual  ground  times  than  the 
originally  scheduled  ground  times. 

1.6. Summary

Chapter  one  presented  the  background  for  the  research,  problem  statement  and  a 
way  ahead.  This  topic  is  very  important  to  the  future  of  AMC  planning  in  theater 
operations.  Chapter  two  discusses  the  literature  for  this  research  and  focuses  on 
applicable  areas  of  linear  regression  with  additional  review  for  future  research.  Chapter 
three  contains  a  discussion  and  explanation  of  the  methodology.  Chapter  four  captures 
the  analysis  of  the  information  generated  by  the  methodology.  Chapter  five  discusses 
conclusions  and  recommendations  for  AMC  and  future  research. 


2. Literature Review


This  chapter  contains  many  techniques  and  areas  of  focus  to  analyze  airflow 
problems  and  cargo  loading  planning  techniques.  Initially  this  problem  was  thought  to 
have  more  of  a  focus  on  the  need  to  understand  actual  airflow  into  and  out  of  the 
Afghanistan Theater of Operations. After much study and analysis, this problem proved
amenable to analysis using a simple linear regression. This literature review has a limited
discussion on the airflow into and out of the theater to inform future researchers of
possible ways to proceed if requested or needed by AMC or other affiliates.

2.1. Linear Regression

Linear  regression  is  a  commonly  used  statistical  technique  to  analyze  a 
relationship between variables. This type of study can be, and is, used in almost every
field. The results have been proven and the methodology is sound. Two books are used in
this  review.  One  is  “Introduction  to  Linear  Regression  Analysis”  (Montgomery,  Peck, 
Vining,  2001)  and  the  other  is  “Design  and  Analysis  of  Experiments”  (Montgomery, 
2009).  Both  of  these  books  have  many  good  points  to  focus  on  in  this  analysis,  but  the 
main  focus  in  the  review  is  on  the  “Introduction  to  Linear  Regression  Analysis”. 

Initially, Montgomery et al. (2001) talk about data collection techniques. There
are  three  basic  methods  of  collecting  data:  a  retrospective  study  based  on  historical  data, 
an  observational  study,  and  a  designed  experiment  (Montgomery,  Peck,  Vining,  2001).  A 
historical  data  collection  is  needed  in  this  analysis;  therefore,  a  retrospective  study  is 
needed. There are several disadvantages of a retrospective study:

Some of the relevant data often are missing. The reliability and quality of the data
are often highly questionable. The nature of the data often may not allow us to
address the problem at hand. The analyst often tries to use the data in ways they
were never intended to be used. Logs, notebooks, and memories may not explain
interesting phenomena identified by the data analysis (Montgomery, Peck,
Vining, 2001, p. 8).

These shortcomings are not all apparent in every historical observation but need to be
kept in mind while conducting an analysis. Some of these problems could lead to outliers.


Simple  linear  regression  is  based  on  one  regressor  (x)  and  its  relationship  with  a 
response  variable  (y).  The  point  is  to  try  and  fit  a  line  by  using  the  data  to  show 
relationships and predict outcomes. This leads to the simple linear regression model:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

The technique used to find $\beta_0$ and $\beta_1$ is the method of least squares: estimate $\beta_0$ and $\beta_1$
so that the sum of the squares of the differences between the observations $y_i$ and the
straight line is a minimum (Montgomery, Peck, Vining, 2001).
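The least-squares estimates for the single-regressor case have a closed form, which the short sketch below computes directly. The function name and the sample numbers are illustrative only.

```python
def fit_simple(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross-products.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up points that lie exactly on y = 1 + 2x.
b0, b1 = fit_simple([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```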


2.2. Multiple Linear Regression

Multiple  linear  regression  is  a  focus  for  this  study.  This  is  because  many  different 
regressors  are  needed  to  understand  a  complicated  system.  A  simple  form  of  this  equation 
would be

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

This is called a multiple linear regression model with k regressors. The
parameters $\beta_j$ (j = 0, 1, ..., k) are called the regression coefficients. This model
describes a hyperplane in the k-dimensional space of the regressor variables $x_j$.
The parameter $\beta_j$ represents the expected change in the response y per unit change
in $x_j$ when all of the remaining regressor variables are held constant
(Montgomery, Peck, Vining, p. 68, 2001).

Any regression model that is linear in its parameters is a linear regression model,
regardless of the shape of the surface it generates.

Multiple  linear  regression  also  uses  the  method  of  least  squares  to  determine  the 
regression  coefficients  as  in  the  simple  linear  regression.  After  accomplishing  this,  there 
needs  to  be  a  check  of  statistical  significance  in  the  regression  model.  The  test  for 
significance  determines  if  there  is  a  linear  relationship  between  the  response  and  any 
regressor variables. For this, an F test can be used to test the hypothesis $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$,
and the rejection criterion would be $F_0 > F_{\alpha,k,n-k-1}$ (Montgomery, Peck, Vining, 2001).
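For reference, the test statistic in its usual textbook form is sketched below. The symbol $SS_R$ (regression sum of squares) is not defined in the text above and is introduced here as an assumption; $SS_{res}$ is the residual sum of squares, k the number of regressors, and n the number of observations.

```latex
F_0 = \frac{SS_R / k}{SS_{res} / (n - k - 1)} = \frac{MS_R}{MS_{res}},
\qquad \text{reject } H_0 \text{ if } F_0 > F_{\alpha,\,k,\,n-k-1}
```

With a single regressor (k = 1) the critical value reduces to $F_{\alpha,1,n-2}$.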

2.3. Checking Model Adequacy

The  major  assumptions  in  regression  are  that  1)  the  observations  are  adequately 
described  by  the  model,  2)  the  errors  are  normally  distributed,  3)  the  errors  are 
independently  distributed,  4)  the  errors  have  a  constant,  but  unknown,  variance  and  5) 
that  the  errors  have  a  mean  of  zero  (Montgomery  2009).  These  are  very  important 
assumptions  that  need  to  be  checked  to  legitimately  make  statistical  inferences. 
Montgomery  also  discusses  ways  to  work  around  some  of  these  areas  if  they  are  not 
adequate. 

To find if the observations are adequately encapsulated in the model, $R^2$ and
adjusted $R^2$ are computed (Montgomery 2009). The coefficient of determination is

$$R^2 = 1 - \frac{SS_{res}}{SS_T}$$

where “$SS_T$ is a measure of the variability in y without considering the
effect of the regressor variable x and $SS_{res}$ is a measure of the variability in y remaining
after x has been considered” (Montgomery, Peck, Vining, p. 39, 2001). $R^2$ could be
considered the proportion of variation explained by x. $R^2$ is between 0 and 1 and the
higher the number, the more the variability is explained, with 1.0 being a perfect fit. The
$R^2$ statistic can falter with numerous regressors (adding regressors overfits the model and
inflates $R^2$); therefore, adjusted $R^2$ was developed. The adjusted $R^2$ statistic penalizes the
model for using many regressors:

$$R^2_{adj} = 1 - \frac{SS_{res} / (n - p)}{SS_T / (n - 1)}$$

where n is the number of observations
and p is the number of model parameters, i.e. the regressors plus the intercept (Montgomery 2009).
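Both statistics can be computed from the observed responses and the fitted values alone. The sketch below assumes p counts all model parameters including the intercept; the function name and the sample numbers are made up for illustration.

```python
def r_squared(y, y_hat, p):
    """Return (R^2, adjusted R^2) for observed y and fitted y_hat.

    p is the number of model parameters, intercept included (assumed here).
    """
    n = len(y)
    y_bar = sum(y) / n
    ss_t = sum((yi - y_bar) ** 2 for yi in y)                # total variability
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained part
    r2 = 1.0 - ss_res / ss_t
    r2_adj = 1.0 - (ss_res / (n - p)) / (ss_t / (n - 1))
    return r2, r2_adj


r2, r2_adj = r_squared([1, 2, 3, 4, 5], [1.2, 1.9, 3.1, 3.9, 4.9], p=2)
print(round(r2, 3), round(r2_adj, 3))
```

The adjusted value is never larger than the unadjusted one when p is at least 1, which is the penalty for additional regressors described above.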

To  check  for  normally  distributed  errors,  a  simple  test  of  the  normal  probability 
plot  of  residuals  is  accomplished.  If  the  residuals  fall  within  a  specific  distance  from  a 
straight  line  through  their  center,  they  are  assumed  to  be  normally  distributed.  Also,  the 
average  value  for  the  residuals  should  be  approximately  zero  (Montgomery  2009). 

Checking  for  independently  distributed  error  requires  a  plot  of  the  residuals  in 
time  sequence.  If  no  pattern  is  visible,  they  are  assumed  independent  (Montgomery 
2009).  Historical  data  can  be  assumed  to  be  independent  due  to  the  lack  of  ability  to  plan 
or  record  information. 

To  verify  that  the  errors  have  a  constant,  but  unknown,  variance,  a  plot  of 
residuals versus fitted (or predicted) values is used. The model is correct and the
assumption  holds  if  the  residuals  do  not  follow  any  pattern.  The  magnitude  of  the 
residuals  versus  the  predicted  values  should  be  relatively  constant  across  the  observations 
and  the  average  value  of  the  residuals  should  be  approximately  zero  (Montgomery  2009). 
If  this  is  not  confirmed,  variance  stabilizing  transformations  can  be  applied  to  the  Y 
variable  to  try  and  correct  the  problem.  This  is  seen  when,  after  the  transformation,  the 
data is more symmetric and does not have a funnel or bow shape.

There are many different types of transformations. Montgomery (2009) discusses
the  square  root,  logarithmic,  arcsine,  reciprocal  square  root,  reciprocal,  and  rank 
transformations.  He  also  discusses  the  use  of  the  Box-Cox  Method  to  estimate  the 
transformation  parameter  (Montgomery  2009).  One  additional  method  that  he  employs 
in  his  earlier  work  is  a  method  of  weighted  least  squares  (Montgomery,  Peck,  Vining 
2001).  All  of  these  methods  can  work  for  different  sets  of  data  based  on  their  individual 
relationships.  Finding  a  useful  transformation  can  make  all  the  difference  in  a  good 
analysis. 

2.4. Outliers & Multicollinearity

Detecting outliers and multicollinearity is important to any linear regression
analysis.  These  areas  can  point  to  fundamental  flaws  or  further  areas  to  analyze.  This 
additional  analysis  could  consist  of  eliminating  the  specific  data  point,  or  could  lead  to 
information  that  sheds  light  on  additional  areas  of  interest. 

Outliers  are  extreme  observations.  These  points  have  residuals  that  are  much 
larger  than  others.  Typically  they  are  three  to  four  standard  deviations  from  the  mean 
(Montgomery,  Peck,  Vining  2001).  These  points  are  not  representative  of  the  rest  of  the 
data  and  could  possibly  have  serious  effects  on  the  regression  model.  Montgomery 
suggests  using  scaled  residuals,  such  as  the  studentized  and  R-student  residuals.  Once 
found,  these  points  need  to  be  investigated.  Hopefully  the  reason  for  their  curious 
behavior  can  be  established.  If  there  was  an  error  in  collecting  the  observation,  this  error 
should  be  fixed  or  the  data  point  should  be  thrown  out  (Montgomery,  Peck,  Vining 
2001). If no error is found and the point is just unusual, then it should be kept in the
model. “Deleting these points to ‘improve the fit of the equation’ can be dangerous, as it
can give the user a false sense of precision in estimation or prediction” (Montgomery,
Peck, Vining, p. 154, 2001).

Specific types of outliers can be seen as leverage or influence points. These points
are explicit outliers in that they affect the model differently and in a relatively exact
manner. Leverage points are points that lie on the regression line and do not affect the
regression equation, but will have an impact on statistics such as $R^2$. Influence points pull
the regression equation in their direction; they therefore lie significantly above or below
the majority of the points. The knowledge of a leverage or influence point does not
mean it should be discarded; as with other outliers, more investigation of those points
needs to be made and a final determination on whether to leave in or discard should be
made judiciously. Cook’s distance can be used to consider both the location of the
point in the x-space and the response variable in measuring influence. This “uses a
measure of the squared distance between the least-squares estimate based on all n points
and the estimate obtained by deleting the ith point, say $\hat{\beta}_{(i)}$” (Montgomery, Peck,
Vining, p. 212, 2001).
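For a model with one regressor, Cook's distance can be computed without matrix algebra because the leverage of each point has a closed form. The sketch below is limited to that single-regressor case, and the data are made up so that the last point is remote in x and off the trend of the others.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of a simple linear regression."""
    n = len(x)
    p = 2  # parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ms_res = sum(e * e for e in residuals) / (n - p)
    # Leverage: how remote each x value is from the center of the data.
    leverage = [1.0 / n + (xi - x_bar) ** 2 / s_xx for xi in x]
    return [e * e * h / (p * ms_res * (1.0 - h) ** 2)
            for e, h in zip(residuals, leverage)]


distances = cooks_distance([1, 2, 3, 4, 10], [1, 2, 3, 4, 20])
print([round(d, 2) for d in distances])
```

A common rule of thumb treats points with a distance above 1 as influential; that threshold is a convention rather than a formal test.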

Multicollinearity  occurs  when  two  or  more  regressors  in  a  multiple  regression  are 
highly  correlated.  Montgomery  states  when  “there  are  near  linear  dependencies  among 
the  regressors  the  problem  of  multicollinearity  exists”  (p.  325,  2001).  This  can  cause  the 
inferences  based  on  the  regression  model  to  be  flawed  or  misleading. 


Figure 2.1 Leverage and Influence Points

There are four primary sources of multicollinearity: “the data collection method
employed,  constraints  on  the  model  or  in  the  population,  model  specification  and  an  over 
defined  model”  (Montgomery,  Peck,  Vining,  2001,  p.  326).  The  data  collection  method 
can  cause  this  to  occur  if  only  a  subspace  of  the  samples  is  taken.  Constraints  on  the 
model  or  in  the  population  can  also  cause  multicollinearity  by  using  regressors  that  are 
correlated.  Montgomery  uses  a  reference  between  family  income  and  house  size  as  two 
regressors  that  would  cause  multicollinearity  (2001).  Model  specification  by  the  choice  of 
model  can  cause  multicollinearity.  If  this  occurs,  look  at  the  specific  reasons  a  model  was 
chosen.  An  over  defined  model  has  more  regressors  than  observations  (Montgomery, 
Peck,  Vining  2001). 

Multicollinearity  is  one  reason  why  large  variances  and  covariances  can  occur  for 
the  least-squares  estimators  of  the  regression  coefficients.  “This  implies  that  different 
samples  taken  at  the  same  x  levels  could  lead  to  widely  different  estimates  of  the  model 
parameters”  (Montgomery,  Peck,  Vining,  2001,  p.329).  This  can  also  produce  least- 
squares  estimates  that  are  too  large  in  absolute  value. 

Detecting  multicollinearity  is  essential  to  understanding  the  multiple  regression 
model. Montgomery et al. (2001) discuss several techniques, including
examination of the correlation matrix, variance inflation factors (VIF), and
eigensystem analysis of X'X. The examination of the correlation matrix involves looking
at  the  off  diagonal  elements  of  the  X'X  matrix.  If  the  absolute  value  is  close  to  1.0,  then 
there  is  a  strong  linear  dependence.  “The  VIF  for  each  term  in  the  model  measures  the 
combined  effect  of  the  dependencies  among  the  regressors  on  the  variance  of  that  term” 
(Montgomery, Peck, Vining, 2001, p. 337). One or more large
VIFs indicate multicollinearity. Montgomery et al. (2001) state from practical
experience that if a VIF exceeds 5 or 10, then the associated regression coefficient is
poorly estimated because of multicollinearity. The eigensystem analysis of the X'X
matrix measures the extent of multicollinearity in the data. “If there are one or more
near-linear dependencies in the data, then one or more of the eigenvalues will be small”
(Montgomery, Peck, Vining, 2001, p. 339).
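One way to obtain the VIFs is as the diagonal elements of the inverse of the regressors' correlation matrix. The sketch below implements that route in plain Python with a small Gauss-Jordan inverse; the helper names and the sample columns are illustrative, and the code does not guard against perfectly collinear regressors (a singular matrix).

```python
def correlation_matrix(columns):
    """Pairwise correlations between regressor columns."""
    n = len(columns[0])
    centered = [[v - sum(c) / n for v in c] for c in columns]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (fails if singular)."""
    size = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [rv - factor * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vifs(columns):
    """Variance inflation factor for each regressor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two nearly identical made-up regressors give very large VIFs.
print(vifs([[1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]]))
```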

Montgomery et al. (2001) discuss multiple ways to deal with multicollinearity.
This can be accomplished by collecting additional data, model respecification, or ridge
regression. Collecting additional data has been suggested as the best method to combat
multicollinearity (Montgomery, Peck, Vining 2001). The additional data should be collected
in a manner designed to break up the multicollinearity in the existing data.

Although multicollinearity can produce poor estimates of the individual model
parameters, it does not necessarily imply that the fitted model is a poor predictor. “If
predictions are confined to regions of the x-space where the multicollinearity holds
approximately, the fitted model often produces satisfactory predictions” (Montgomery,
Peck, Vining, 2001, p. 330).

2.5. Variable Selection & Model Building

Variable selection and model building are integral to analysis. There are many
methods that Montgomery discusses to find the best regression equation, and there are
advantages to all of them. Some of the ways to measure and determine the best fit and
build the model are by using the coefficient of determination ($R^2$), adjusted $R^2$, residual
mean square ($MS_{res}$), Mallows’s $C_p$ statistic, and AICc. $R^2$ and adjusted $R^2$ were
previously discussed.


The residual mean square for a model with p parameters is

$$MS_{Res}(p) = \frac{SS_{Res}(p)}{n - p}$$

where SSRes(p) is the residual sum of squares of that model and n is the number of
observations. The goal is to minimize MSRes, which coincides with adjusted R² being at
its maximum. Mallows’s Cp statistic is related to the mean square error of a fitted
value and looks for bias in the model:

$$C_p = \frac{SS_{Res}(p)}{\hat{\sigma}^2} - n + 2p$$

where σ̂² is an estimate of the error variance. Montgomery et al. (2001) state that small
values of Cp are desirable. Mallows (1973) states that minimizing Cp is similar to a
stepwise regression algorithm and that the smallest or negative Cp - p is a good fit.
Azen and Budescu (2009) show that a model with Cp ≈ p, that is, a small difference
between the two, has a good fit with no bias, and that models with Cp > p have some bias.

Akaike’s corrected information criterion (AICc) is a bias-corrected version of
Akaike’s information criterion (AIC) (Lindsey and Sheather, 2010).


$$AIC = n \log\left(\frac{RSS}{n}\right) + 2k + n + n \log(2\pi)$$

$$AICc = AIC + \frac{2(k + 2)(k + 3)}{n - (k + 2) - 1}$$


As the criterion decreases, the model becomes more desirable. The criterion is based on
the maximized log likelihood of the predictor coefficients and error variance (Lindsey
and Sheather, 2010). No particular magnitude of AICc is sought; among the candidate
models, the one with the lowest AICc is the most desirable.
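
As a hedged illustration of how these criteria compare candidate models, the sketch below codes the standard textbook forms of the residual mean square, adjusted R² and Mallows’s Cp, together with the AIC and AICc expressions attributed to Lindsey and Sheather (2010). It assumes p counts the model parameters including the intercept and k counts the predictors. All numbers are hypothetical and are not results from this research.

```python
import math

def ms_res(ss_res, n, p):
    """Residual mean square for a model with p parameters (incl. intercept)."""
    return ss_res / (n - p)

def adjusted_r2(ss_res, ss_total, n, p):
    """Adjusted R^2; it is largest exactly when the residual mean square is smallest."""
    return 1.0 - (ss_res / (n - p)) / (ss_total / (n - 1))

def mallows_cp(ss_res, sigma2_hat, n, p):
    """Mallows's Cp, with sigma2_hat usually taken from the full model."""
    return ss_res / sigma2_hat - n + 2 * p

def aic(rss, n, k):
    """AIC for a linear model with k predictors (assumed meaning of k)."""
    return n * math.log(rss / n) + 2 * k + n + n * math.log(2 * math.pi)

def aicc(rss, n, k):
    """Bias-corrected AIC."""
    return aic(rss, n, k) + 2 * (k + 2) * (k + 3) / (n - (k + 2) - 1)

# Hypothetical example: a full model and a smaller candidate model.
n, ss_total = 60, 5000.0
full = {"p": 6, "ss_res": 1080.0}
small = {"p": 3, "ss_res": 1200.0}
sigma2_hat = ms_res(full["ss_res"], n, full["p"])
for name, m in (("full", full), ("small", small)):
    k = m["p"] - 1  # predictors = parameters minus the intercept
    print(name,
          round(ms_res(m["ss_res"], n, m["p"]), 2),
          round(adjusted_r2(m["ss_res"], ss_total, n, m["p"]), 3),
          round(mallows_cp(m["ss_res"], sigma2_hat, n, m["p"]), 2),
          round(aicc(m["ss_res"], n, k), 2))
```

When σ̂² is estimated from the full model, the full model always has Cp equal to p, so Cp is informative only for the smaller candidates.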

There are many computational techniques for variable selection. Montgomery et al.
(2001) discuss trying all possible regressions and stepwise regression. The all possible
regressions method is made easier with strong computer programs such as JMP and
efficient algorithms. Montgomery et al. (2001) note that with fewer than 30 regressors,
the all possible regressions approach can be solved relatively easily.

Stepwise-type procedures break down into three specific areas: forward selection,
backward elimination and stepwise regression (a combination of the first two)
(Montgomery, Peck, Vining, 2001). Forward selection starts with zero regressors in the
model. One regressor is added to the model at a time. The first regressor selected for
entry  is  the  one  with  the  largest  simple  correlation  with  the  response  variable.  This 
regressor  is  entered  if  its  F  statistic  exceeds  a  specified  F  value.  The  second  regressor 
picked  for  entry  is  the  one  with  the  largest  correlation  with  the  response  after  adjusting 
for  the  effect  of  the  first  regressor,  and  if  its  F  statistic  exceeds  the  specified  F  value,  it  is 
also  added  (Montgomery,  Peck,  Vining,  2001).  This  continues  until  the  next  regressor 
with  the  largest  correlation  does  not  surpass  the  specified  F  value. 

Backward elimination uses the partial F statistic as well. The partial F statistic is
computed for each regressor as if it were the last variable to enter the model. The smallest
of these partial F statistics is compared with a preselected F value, and if it is less than
that value, the regressor is removed. This continues until the smallest partial F statistic
is not below the specified F value for elimination (Montgomery, Peck, Vining, 2001).
Stepwise regression combines both of these methods and needs one F value for inclusion
and another F value for elimination from the model. It is a modification of forward
selection in that it starts with zero regressors and adds them as in the forward selection
method. Following each inclusion, however, the backward method is applied to see whether
any previously entered regressor should be eliminated. Frequently the F value to enter is
chosen to be higher than the F value to leave; therefore, it is “more difficult to add a
regressor than to delete one” (Montgomery, Peck, Vining, 2001, p.314).
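
The stepwise procedure described above can be sketched as follows. This is a minimal illustration that assumes NumPy; the F thresholds (4.0 to enter, 3.9 to leave), the function names and the synthetic data are hypothetical choices, and the code is not the JMP implementation.

```python
import numpy as np

def ss_res(X, y, cols):
    """Residual sum of squares for an intercept model using the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def partial_f(X, y, cols, j):
    """Partial F statistic of regressor j given the other regressors in cols."""
    full = ss_res(X, y, cols)
    reduced = ss_res(X, y, [c for c in cols if c != j])
    df_res = len(y) - len(cols) - 1
    return (reduced - full) / (full / df_res)

def stepwise(X, y, f_in=4.0, f_out=3.9, max_steps=100):
    """Forward selection with a backward-elimination check after each entry."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if f_in < f_out:
        raise ValueError("f_in should be at least f_out to avoid cycling")
    selected = []
    for _ in range(max_steps):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        # Forward step: enter the candidate with the largest partial F.
        f_vals = {j: partial_f(X, y, selected + [j], j) for j in candidates}
        best = max(f_vals, key=f_vals.get)
        if f_vals[best] <= f_in:
            break
        selected.append(best)
        # Backward check: remove regressors whose partial F fell below f_out.
        while len(selected) > 1:
            f_now = {j: partial_f(X, y, selected, j) for j in selected}
            worst = min(f_now, key=f_now.get)
            if f_now[worst] >= f_out:
                break
            selected.remove(worst)
    return sorted(selected)

# Hypothetical data: only columns 0 and 3 truly affect the response.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 3] + rng.normal(size=200)
print("Selected columns:", stepwise(X_demo, y_demo))
```

At the first step the largest partial F corresponds to the largest simple correlation with the response, which matches the entry rule described in the text.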

2.6. Model Validity

Montgomery et al. (2001) discuss three validation techniques for regression models.
These are: 1) “analysis of the model coefficients and predicted values including comparisons
with  prior  experience,  physical  theory,  and  other  analytical  models  or  simulation  results. 
2)  Collection  of  new  (or  fresh)  data  with  which  to  investigate  the  model’s  predictive 
performance.  3)  Data  splitting,  that  is,  setting  aside  some  of  the  original  data  and  using 
these  observations  to  investigate  the  model’s  predictive  performance”  (Montgomery, 
Peck,  Vining,  2001,  p.530). 

Analysis  of  the  model  coefficients  and  predicted  values  should  be  studied  to 
determine  if  they  are  stable  and  their  signs  and  magnitudes  are  reasonable.  “Previous 
experience,  theoretical  considerations,  or  an  analytical  model  can  often  provide 
information  concerning  the  direction  and  relative  size  of  the  effects  of  the  regressors” 
(Montgomery, Peck, Vining, 2001, p.531). The VIF can also be used as a guideline as
discussed previously.

Collecting  fresh  data  is  the  most  effective  way  of  validating  a  regression  model 
with  respect  to  its  predictive  performance  (Montgomery,  Peck,  Vining,  2001).  If  the 
model  gives  accurate  predictions  of  the  new  data,  these  confirmatory  runs  will  be  seen  as 
evidence that the model works. Montgomery et al. (2001) recommend at least 15-20 new
observations to get a reliable assessment of performance.

Splitting the data is acceptable if collecting new data is not possible. When this
happens,  the  data  needs  to  be  split  into  two  parts,  the  estimation  data  and  the  prediction 
data  (Montgomery,  Peck,  Vining,  2001).  Careful  consideration  as  to  what  data  goes  into 
each  category  is  needed.  A  disadvantage  of  this  method  is  that  it  reduces  the  precision 
with  which  the  regression  coefficients  are  estimated  (Montgomery,  Peck,  Vining,  2001, 
p.537). 
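
A minimal sketch of data splitting is shown below, assuming NumPy. The model is fitted on the estimation data and its predictive performance is then measured on the held-out prediction data with a prediction R². The synthetic missions, the coefficients and the month-based split are hypothetical illustrations, not the data of this research.

```python
import numpy as np

def fit_ols(X, y):
    """Least squares fit with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predicted responses for regressor matrix X."""
    return beta[0] + X @ beta[1:]

def prediction_r2(y_true, y_pred):
    """Share of variability in the held-out responses explained by the model."""
    ss_pred = np.sum((y_true - y_pred) ** 2)
    ss_total = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_pred / ss_total

# Hypothetical missions: 7 months, 100 missions per month.
rng = np.random.default_rng(42)
n = 700
month = np.repeat(np.arange(1, 8), 100)
pallets = rng.integers(0, 19, size=n).astype(float)
pax = rng.integers(0, 101, size=n).astype(float)
ground_time = 60 + 8 * pallets + 0.3 * pax + rng.normal(scale=10, size=n)
X = np.column_stack([pallets, pax])

est = month <= 6                      # estimation data: months 1 to 6
beta = fit_ols(X[est], ground_time[est])
r2_pred = prediction_r2(ground_time[~est], predict(beta, X[~est]))
print("Coefficients:", np.round(beta, 2))
print("Prediction R^2 on held-out month:", round(float(r2_pred), 3))
```

Splitting by time period, as here, is only one possible rule for assigning observations to the two sets.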


2.7. Integer Programming Techniques

Integer  programming  techniques  to  solve  air  traffic  flow  management  problems 
have  been  studied  and  published  in  many  journals.  Integer  programming  has  advantages 
in  this  type  of  study.  One  is  that,  most  of  the  time,  a  closed  form  solution  can  be  found. 
Another  is  that  the  known  constraints  can  usually  be  accounted  for  accurately  and 
updated  in  a  timely  manner.  A  drawback  is  that,  due  to  the  size  of  the  network  and 
problem,  not  all  constraints  can  be  accounted  for. 

Bertsimas and Stock (1998) considered the air traffic flow management problem
for commercial aircraft and used an integer programming method to optimize the flow
of air traffic. They built a model that accounted for the capacities of the National
Airspace System as well as capacities at individual airports. They then solved a
large-scale, realistically sized problem with several thousand flights, which
significantly improved the state of the system.

This study included a reduced problem specific to a ground holding problem.
This special case involved only the departure and arrival airport and had significant
computational advantages over the larger problem. The ground hold problem is in line
with optimizing ground times in Afghanistan.

Baker et al.’s (2001) article on optimizing military airlift used the same premise
and included a mathematical formulation with very specific constraints. One of these
constraints dealt with airfield parking and servicing capacity constraints. These mainly
deal with the number of parking spots at the airfield and if fuel or other services are
available. Their technique and constraining process are useful for minimizing ground
times in Afghanistan.

One of the most recent and notable articles is “An integer programming approach
to support the US Air Force’s air mobility network” by Koepke et al. (2008). This
research extended Bertsimas and Stock’s study to the Air Force. Koepke et al. used a
maximum number of jets on the ground compliance formula (MCF) in order to suggest
how to delay aircraft on the ground to avoid a violation of multiple constraints. This
formulation takes into account constraints based on the priority of the mission, diplomatic
clearances, hazardous cargo, and time delays.

2.8.  Simulations 

Simulation methods that deal with the mobility airlift problem mostly encompass
the entire flow of cargo and aircraft from the point of embarkation to the point of
debarkation. These simulations are very intricate, but do not address precisely how much
time one aircraft should spend on the ground at a specified location. Examples of these
simulations are MASS (Mobility Analysis Support System) and AMOS (Air Mobility
Operations Simulator).

The simulation used predominantly by AMC is AMOS. This is a
discrete-event worldwide airlift simulation model used in strategic and theater operations
to deploy military and commercial airlift assets (Wu et al., 2009). It is favored because of
its tremendous flexibility and ability to handle uncertainty. However, this simulation
method requires significant input by the user to specify a series of rules to obtain
realistic behaviors.


2.9. Stochastic Models

Ball et al. (2003) developed a stochastic integer program with dual network
structure and applied it to the ground holding problem. This paper analyzed a
generalization of the classic network flow model. It also showed that the matrix underlying
the stochastic model is a dual network. “Thus the integer program associated with the
stochastic model can be solved efficiently using network flow or linear programming
techniques” (Ball et al., 2003).

Mukherjee and Hansen (2007) developed a dynamic stochastic model for the
single airport ground holding problem. Their stochastic model has the ability to account
for uncertainty and is able to update information based on evolving forecasts. It is
an optimization model that assigns ground delays to individual aircraft to optimize
an objective related to quantities of airborne and ground delays. This allows for
revised ground delays for flights that have not yet taken off for their next location. The
uncertainty in this model is addressed by considering a finite set of potential scenarios of
how the airfield arrival capacity may develop. This uncertainty is easier to understand in
the commercial environment where weather is the major uncertainty. In a combat
situation, there are many more uncertainties that will arise as aircraft come into and out of
theater.


2.10. Summary

Chapter two summarized the literature relevant to this field and the methods used in
this research. The main topics included linear regression and applicable themes in that
area of study. Additional areas of study are incorporated and can be used in future
research. Chapter three uses the linear regression topics and expounds on how they are
used for this specific research.

3. Methodology


3.1. Data Synchronization

Two databases are used to gather needed information and incorporate all of the
data to analyze the problem. These databases are the Global Decision Support
System 2 (GDSS II) and the Global Air Transportation and Execution System (GATES).
These two databases are independent systems that are integrated for this analysis.

GDSSII provides an enormous amount of information to the Air Force about
specific missions that are accomplished around the world. A subset of this information
includes the scheduled arrival time per mission, scheduled departure time per mission,
actual arrival time per mission, actual departure time per mission, mission identification
number, arrival location (International Civil Aviation Organization, ICAO), previous
location (ICAO), next location (ICAO), aircraft type (Mission Design Series, MDS),
Total Passengers (Pax), Total Cargo, and delay remarks. All of this information is
important for this analysis and was pulled from the system for the months of January-July
2010.

GATES also provides a plethora of information to the Air Force and DoD
partners about specific loads on aircraft throughout the world. The subset of GATES
information that is needed for this analysis includes: mission identification number
(Aerial Port of Debarkation (APOD) mission number), aircraft type (Mission
Design Series, MDS), APOD ICAO, number of passengers, pallet net weights, pallet
type, and equivalent pallet positions. This critical information was pulled from the
system for the months of January-July 2010.

These two databases are synchronized using EXCEL. GDSSII has the
ability to download directly into EXCEL, and GATES uses an MS Access format that is
downloaded  into  EXCEL.  These  databases  are  merged  using  the  mission  identification 
number  from  GDSSII  and  the  APOD  mission  number  from  GATES.  Pivot  tables  and 
lookup  functions  in  EXCEL  make  the  process  easier,  but  this  process  still  requires  a  very 
large  number  of  data  tables  in  EXCEL  to  properly  separate  and  merge  data.  These  final 
spreadsheets  include  36  columns  of  information  consisting  of  information  from  GDSSII 
and  GATES.  GDSSII  information  includes  the  Mission  number,  aircraft  type,  airfield, 
actual  departure  time  of  day  (Greenwich  Mean  Time),  scheduled  time  on  the  ground 
(mins),  actual  time  on  the  ground  (mins),  total  passengers  and  total  cargo  in  lbs,  delay 
codes  and  delay  remark.  GATES  information  includes:  equivalent  pallet  positions  for  the 
offload  (10  columns)  and  onload  (10  columns)  of  basic  cargo,  loose  stock,  palletized 
cargo,  rolling  stock,  standard  cargo,  and  pallet  trains  of  size  2,  3,  4,  5,  and  6,  total  cargo 
offloaded  in  equivalent  pallet  positions,  total  cargo  onloaded  in  equivalent  pallet 
positions,  passengers  offloaded,  passengers  onloaded,  total  passengers,  and  total  cargo  in 
equivalent  pallet  positions. 

The merging of GATES and GDSSII databases shows substantial error. Although
these systems are required to be used by the Air Force and DoD, they do not match
during the period studied. For example, GDSSII recorded 8687 mission numbers while
GATES pulled 8369 mission numbers from January-July 2010. (GDSSII does include
minor cargo and passenger information, but does not include the specific cargo and
passenger data needed to accomplish this analysis). The information from GATES is broken
into 6528 missions with cargo information and 4682 missions with passenger
information, where 2841 missions have both cargo and passenger information. When
these databases are merged, the data must be limited to missions that are in both
databases. This yields an overlap of 7342 mission numbers from GDSSII and GATES
that have cargo or passenger information. This can be seen in Figure 3.1.
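
The merge logic can be sketched in a few lines of standard-library Python. The records, field names and mission numbers below are hypothetical stand-ins for the GDSSII and GATES extracts (the actual merge was done in EXCEL); only the three GATES counts in the final check come from the text.

```python
# Hypothetical miniature extracts keyed by mission identification number.
gdss = {
    "M001": {"mds": "C-17", "airfield": "Bagram", "actual_ground_min": 135},
    "M002": {"mds": "C-130", "airfield": "Kandahar", "actual_ground_min": 48},
    "M003": {"mds": "C-17", "airfield": "Camp Bastion", "actual_ground_min": 150},
    "M004": {"mds": "C-5", "airfield": "Bagram", "actual_ground_min": 310},
}
gates_cargo = {"M001": 12.0, "M002": 6.0, "M005": 4.0}   # equivalent pallet positions
gates_pax = {"M002": 30, "M003": 85}                      # passengers

# GATES missions are those with cargo OR passenger information.
gates_ids = set(gates_cargo) | set(gates_pax)
# Keep only missions that appear in both systems.
merged_ids = set(gdss) & gates_ids
merged = {
    m: {**gdss[m], "cargo_epp": gates_cargo.get(m), "pax": gates_pax.get(m)}
    for m in sorted(merged_ids)
}
for mission, row in merged.items():
    print(mission, row)

# Consistency check on the counts reported in the text: missions with cargo
# plus missions with passengers, minus those counted twice, should equal the
# total number of GATES mission numbers.
gates_total = 6528 + 4682 - 2841
print(gates_total)  # 8369
```

The last line reproduces the 8369 GATES mission numbers reported above, which confirms that the cargo, passenger and overlap counts are mutually consistent.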

Figure 3.1 Data Base Merging (Jan-Jul 2010 OEF Missions)

This data is broken down by aircraft type and airfield. This reduces variance
based on taxi time, cargo capacity, aircraft capabilities, aerial port capabilities at each
airfield, and other basic mission issues that are specific to each jet at each airfield.
Therefore, eight sets of data are analyzed. These sets include three aircraft types (C-17,
C-5 and C-130) at three airfields (Bagram AB, Kandahar AB, and Camp Bastion,
Afghanistan). Note that C-5s do not transit Camp Bastion. Table 3.1 lists the data sets.

Table 3.1 Data Base Description

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      Data Base 1        Data Base 2          Data Base 3
C-130     Data Base 4        Data Base 5          Data Base 6
C-5       Data Base 7        Data Base 8          C-5s do not transit

Data splitting is used because all data are historical and there are seven months of
data with thousands of data points. Six months of data, January through June, are used to
build the model. The seventh month is used to validate the models. A total of 6541 lines
of data from January through June are sifted through for useful mission information. The
initial data statistics are shown in Table 3.2.

Table 3.2 Lines of data per airfield per aircraft type

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      895                839                  789
C-130     2132               1275                 419
C-5       78                 114                  C-5s do not transit

Further pruning is accomplished based on delay codes and delay remarks of
missions in GDSSII. Delay codes are numbers that should correspond with different
reasons for late departures. After analyzing thousands of lines of data, this set of
supposedly easy to use information is deemed unusable. This is due to hundreds of the
same delay codes being used with conflicting delay remarks, e.g. delay code 201 would
have a delay remark of “no delay” in some lines and a delay from “previous station” in
others. Therefore, each individual delay remark needed to be reviewed and filtered for
usefulness. If the subject matter expert (SME) judges that the delay remark indicates a
significant delay, then that line of data is considered unusable. Many of the delay
remarks include delays for maintenance, human remains movement, weather, flight
planning delays/HHQ taskings, ramp freezes, MEDEVACs, double blocking, fueling, ATC
congestion, specific user delays to include distinguished visitor movements, closure of
the runway for hostile fire and many other reasons.


Table 3.3 Lines of data without delays

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      635                673                  649
C-130     1456               822                  323
C-5       52                 79                   C-5s do not transit

Additional  pruning  needs  to  occur  for  lines  that  do  not  have  cargo  or  passenger 
information  (e.g.  the  mission  shows  zero  cargo  and  zero  passengers  moved).  This 
requires  sorting  by  total  cargo  and  then  sorting  by  total  passengers.  This  further  lowers 
our  usable  data  as  shown  in  Table  3.4. 

Table 3.4 Lines of data with cargo/passenger information without delays

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      529                565                  589
C-130     1251               728                  287
C-5       27                 48                   C-5s do not transit

Other areas where pruning is needed are data lines showing more pallet positions
or passengers carried than the specific airframes can actually carry and duplicate mission
numbers. Data lines showing more cargo or passengers than the airframe can carry are
deleted. Some duplicate mission numbers also have duplicate cargo information, but
different ground times. These missions are individually examined and eliminated based on
delay remarks. This subsequently lowers the available data to the numbers in Table 3.5.

Table 3.5 Lines of data with cargo/passenger information without delays

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      439                485                  570
C-130     736                429                  195
C-5       27                 48                   C-5s do not transit

Supplementary pruning is also accomplished based on actual ground times. It is
observed that many mission numbers are associated with very small ground times but still
offload and onload a significant number of passengers and/or amount of cargo. These
missions are associated with engine running offloads and onloads. It is also observed that
many missions with typical offloads and onloads are on the ground for an extended period
of time with no remarks or delays. These missions are deemed by the SME to be unrealistic
and to have an error that is unexplained or undocumented.

Therefore, C-17 missions with times on the ground below 60 minutes or above 360
minutes (6 hours) are not used. The SME considers 60 minutes the shortest time in which a
crew can taxi in, perform normal crew duties involving engine shutdown and startup, taxi
out and take off. The SME also considers 360 minutes to be the upper limit of
cargo offloading and onloading for extreme cases. One of these cases could involve the
downloading and uploading of a major sized helicopter. In order to use the same rationale
with the C-130 and C-5, the upper time limit for the C-17 is used as a base to eliminate
erroneous data. The C-17 upper limit is close to 2 standard deviations away from the
mean for the three airfields. Therefore, for C-5s and C-130s, times more than two standard
deviations above the mean are considered too long on the ground and are assumed to have
either undocumented delays or planned ground times for reasons other than cargo.
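
The ground time limits described above can be expressed as a small filter. The sketch below uses only the Python standard library. The sample ground times are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev

def upper_limit(ground_times):
    """Mean plus two sample standard deviations (assumed definition)."""
    return mean(ground_times) + 2 * stdev(ground_times)

def prune(ground_times, lower, upper):
    """Keep ground times within [lower, upper] minutes, inclusive."""
    return [t for t in ground_times if lower <= t <= upper]

# C-17 rule from the text: drop times below 60 or above 360 minutes.
c17_times = [45, 60, 200, 360, 400]              # hypothetical minutes
print(prune(c17_times, 60, 360))                 # [60, 200, 360]

# C-130 rule: 20 minute minimum, mean + 2 standard deviations as the maximum.
c130_times = [15, 25, 40, 55, 60, 75, 90, 500]   # hypothetical minutes
limit = upper_limit(c130_times)
print(round(limit, 1))
print(prune(c130_times, 20, limit))              # [25, 40, 55, 60, 75, 90]
```

Because the limit is computed from the same data it prunes, a single extreme value inflates both the mean and the standard deviation; this is one reason the choice of limits rests on SME judgment.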

Table 3.6 Two Standard Deviations above the mean (minutes)

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-130     246                284                  157
C-5       628                455                  C-5s do not transit

For C-5s, the shortest ground time is only a factor for one mission (40 mins on the ground
is unrealistic for a C-5 considering taxi and crew operations) and this point is eliminated.
For C-130s, 20 minutes is considered the minimum time to taxi, offload or onload, and
take off. This is used instead of 60 minutes due to the C-130’s consistent use of engine
running offloads and onloads. This takes the total numbers for the six month period down
to where they can be introduced into JMP, see Table 3.7. The amount of data lost due to
error and/or delays is significant. See Table 3.8 for the percentage of usable data from
January to June 2010.

Table 3.7 Lines of data after final pruning

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      374                383                  388
C-130     574                383                  173
C-5       24                 43                   C-5s do not transit

Table 3.8 Percent of usable data from original set from Jan-Jun 2010

          Bagram Air Base    Kandahar Air Base    Camp Bastion
C-17      41.79%             45.65%               49.18%
C-130     26.92%             30.04%               41.29%
C-5       30.77%             37.72%               C-5s do not transit
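
The percentages in Table 3.8 follow directly from the counts in Table 3.2 (initial lines) and Table 3.7 (lines after final pruning). A short standard-library Python check is given below; the dictionary layout is only an illustrative choice.

```python
# Counts copied from Table 3.2 (initial) and Table 3.7 (after final pruning).
initial = {
    ("C-17", "Bagram"): 895, ("C-17", "Kandahar"): 839, ("C-17", "Camp Bastion"): 789,
    ("C-130", "Bagram"): 2132, ("C-130", "Kandahar"): 1275, ("C-130", "Camp Bastion"): 419,
    ("C-5", "Bagram"): 78, ("C-5", "Kandahar"): 114,
}
final = {
    ("C-17", "Bagram"): 374, ("C-17", "Kandahar"): 383, ("C-17", "Camp Bastion"): 388,
    ("C-130", "Bagram"): 574, ("C-130", "Kandahar"): 383, ("C-130", "Camp Bastion"): 173,
    ("C-5", "Bagram"): 24, ("C-5", "Kandahar"): 43,
}

# Percent of the original lines that remain usable for each data set.
usable = {key: round(100 * final[key] / initial[key], 2) for key in initial}
for (aircraft, airfield), pct in usable.items():
    print(f"{aircraft:6s} {airfield:13s} {pct:.2f}%")

print(sum(initial.values()))  # 6541
```

The final line reproduces the 6541 total lines of data from January through June.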

3.2. Regression

Multiple linear regression is used to determine the optimum ground time for
specific aircraft at specific airfields in Afghanistan. This is based on a retrospective study
with historical data from January-July 2010. While using a retrospective study, it is
known that some relevant data is often missing and the reliability and quality could be
questionable. This can be seen with the GDSSII and GATES databases not matching
perfectly and some data missing or not considered valid by the subject matter expert.

The JMP program is used in order to accomplish regression due to the number of
regressors applied to this complicated system. This is considered a linear problem because
the time it takes to offload cargo and the time on the ground in Afghanistan are assumed
to be linearly related.

JMP’s multiple linear regression has many steps to accomplish. First, the data
must be collected and entered into a new data table. This is accomplished for all eight
data sets. The data is first taken out of the 36 columns from the GDSSII and GATES
merged EXCEL spreadsheets. Both columns of time on the ground in minutes are entered
into the data table, with the actual times on the ground used as the Y variable.

This problem uses 22 regressors. They include varying types of cargo offloaded
and onloaded at each location, along with the number of passengers offloaded and then
onloaded at each station. There are ten different types of cargo categorized by the
GATES system: belly cargo (BC), loose stock (LS), rolling stock (RS),
palletized cargo (PC), skid cargo (SD), and pallet trains consisting of two to six pallets
tied together as one pallet (T2, T3, T4, T5, T6). This makes up 20 of the regressors (ten
during offload and ten during onload), each taking a different amount of time to
accomplish. Each one of these types of cargo is given an equivalent pallet position in
GATES. This means that a certain type of cargo, e.g. a HMMWV as rolling stock
taking up two pallet positions on an airframe, is counted as the number of pallet
positions it displaces on each aircraft. This is the number that is used to analyze the
system. Weight was initially used, but due to the variance in weight per pallet position,
number of pallet positions is a much better factor for time on the ground. For example, it
takes the same time, manpower and equipment to push a pallet that weighs 100 lbs as it
does to push one that weighs 2000 lbs. Finally, passengers offloaded and passengers
onloaded are the last two regressors, each taking a different amount of time to
accomplish. Passengers are counted individually.

The labels used in JMP, and for the actual columns in the EXCEL database,
include the following: Sched Time on Ground mins, Actual Time on Ground mins (Y
variable), BC off, LS off, PC off, RS off, SD off, T2 off, T3 off, T4 off, T5 off, T6 off,
BC on, LS on, PC on, RS on, SD on, T2 on, T3 on, T4 on, T5 on, T6 on, pax offloaded,
and pax onloaded. Any type of cargo followed by an “off” is considered occurring in the
offload phase of operations and any type of cargo followed by an “on” is considered
occurring in the onload phase of the mission.
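
As a compact restatement of the regressors listed above, the candidate model before variable selection can be written as follows. The notation is introduced here only for illustration: y is the actual time on the ground in minutes, each x is the corresponding column (equivalent pallet positions for cargo, a head count for passengers), and the β terms are the coefficients to be estimated.

```latex
$$
y = \beta_0
  + \sum_{c \in C} \left( \beta_c^{\mathrm{off}} x_c^{\mathrm{off}}
                        + \beta_c^{\mathrm{on}} x_c^{\mathrm{on}} \right)
  + \beta_{\mathrm{pax}}^{\mathrm{off}} x_{\mathrm{pax}}^{\mathrm{off}}
  + \beta_{\mathrm{pax}}^{\mathrm{on}} x_{\mathrm{pax}}^{\mathrm{on}}
  + \varepsilon,
\qquad
C = \{\mathrm{BC}, \mathrm{LS}, \mathrm{PC}, \mathrm{RS}, \mathrm{SD},
      \mathrm{T2}, \mathrm{T3}, \mathrm{T4}, \mathrm{T5}, \mathrm{T6}\}
$$
```

The sum contributes 2 × 10 = 20 cargo terms and the passenger terms add two more, giving the 22 regressors.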

Once the data are collected into a new data table in JMP, the computational
technique of stepwise regression is used for all eight scenarios. This is accomplished by
selecting the Analyze tab, then Fit Model. Then, actual time on the ground in mins is
selected as the Y variable and the 22 regressors are selected as construct model effects.
Next, stepwise is selected with p-value thresholds of 0.1 for the probability to
enter and 0.05 for the probability to leave, and mixed stepwise is run. Interactions
of these regressors were examined initially and found to have no significance; therefore,
they are not considered in this analysis.

The mixed stepwise regression uses forward selection and backward
elimination together to select only the best regressors that create the strongest model.
This starts with zero regressors in the model. One regressor is added to the model at a
time and then checked to make sure it stays in the model. The first regressor selected for
entry is the one with the largest simple correlation with the response variable. This
regressor is entered if the p-value of its F statistic is below the probability to enter
threshold (0.1). Following the inclusion, the model is changed and the backward method is
then applied to see if any previously entered regressor should be eliminated. A regressor
is eliminated if the p-value of its F statistic exceeds the probability to leave threshold
(0.05). This is based on the new model; therefore, the F statistic for this new model is
needed. This continues until neither the selection p-value threshold nor the
elimination p-value threshold is met. This creates the strongest model. This stepwise
regression produces a Sum of Squares Error, Degrees of Freedom for Error, the
coefficient of determination (R²), adjusted R², Mallows’s Cp statistic and AICc. Adjusted
R² is closely analyzed due to the large number of regressors.

From the stepwise fit screen, Make Model is selected. This keeps actual time on
the ground as the Y variable and uses the regressors selected in the stepwise regression to
construct the model effects. Then, least squares is run to find the following information:
Summary of Fit, which includes the coefficient of determination (R²), adjusted R², Root
Mean Square Error, Mean of the Response, and Observations; an ANOVA table; parameter
estimates; the Residual by Predicted plot; the Actual by Predicted plot; leverage plots;
and a lack of fit table.


3.3. Checking Model Adequacy

Checking the model’s adequacy is accomplished next. The major assumptions of
regression  are  checked:  1)  the  observations  are  adequately  described  by  the  model;  2)  the 
errors  are  normally  distributed;  3)  the  errors  are  independently  distributed;  4)  the  errors 
have  a  constant,  but  unknown,  variance;  and  5)  the  errors  have  a  mean  of  zero 
(Montgomery  2009). 

To check for normally distributed errors, a simple test using the normal probability
plot of residuals is accomplished. In JMP, the Normal Quantile Plot is used as the normal
probability plot of residuals. These two plots are the same, but use different scales. If the
residuals fall within a specific distance from a straight line through their center, they are
assumed to be normally distributed. Also, the average value of the residuals should be
approximately zero (Montgomery 2009). The check on this distance is called the “fat pencil
test”: if the data points fall within a pencil's thickness of the line, they are assumed
normal, with slight divergence allowed at the lower and upper ends. This chart is accessed
in JMP by saving the residuals as a column in the original data table. After returning to the
original data table, analysis of the distribution of the new residual column is conducted
and the normal quantile plot is analyzed. The residuals are also examined for a mean of zero.
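As a rough illustration of what the normal quantile plot encodes, the sketch below pairs sorted residuals with theoretical normal quantiles and reports their correlation; values near 1 correspond to points that hug the straight line. The residual values, the function names, and the (i - 0.5)/n plotting position are assumptions made for illustration and are not taken from JMP or from the study data.

```python
from statistics import NormalDist, mean

def normal_quantile_points(residuals):
    """Pair each sorted residual with its theoretical standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting position (i - 0.5) / n is one common convention (assumed here).
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def straightness(points):
    """Pearson correlation between theoretical quantiles and residuals."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical residuals in minutes, not taken from the study data.
residuals = [-42.0, -25.5, -12.0, -6.5, -1.0, 2.5, 8.0, 14.5, 27.0, 35.0]
points = normal_quantile_points(residuals)
r = straightness(points)
residual_mean = mean(residuals)
```

A correlation well below 1, or a residual mean far from zero, would be a numerical hint of the same departures that the fat pencil test looks for visually.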

Checking for independently distributed errors requires a plot of the residuals in
time  sequence.  If  no  pattern  is  visible,  they  are  assumed  independent  (Montgomery 
2009).  In  this  analysis,  historical  data  is  assumed  to  be  independent  due  to  the  lack  of 
ability  to  plan  or  record  information. 

To  satisfy  that  the  errors  have  a  constant,  but  unknown,  variance,  a  plot  of 
residuals versus fitted (or predicted) values is used. The model is correct and the
assumption  holds  if  the  residuals  do  not  follow  any  pattern.  The  magnitude  of  the 
residuals  versus  the  predicted  values  should  be  relatively  constant  across  the  observations 
and  the  average  value  of  the  residuals  should  be  approximately  zero  (Montgomery  2009). 
This  is  seen  in  the  fit  of  the  least  squares  with  the  residuals  by  predicted  plot.  A  pattern  in 
all  of  the  scenarios  is  not  readily  apparent.  Still,  transformations  are  accomplished  to 
attempt  to  alleviate  any  possible  patterns.  The  transformations  discussed  in  Montgomery 
(2009) and available in JMP are the square root, logarithmic and reciprocal.

3.4. Outliers


Upon initial review there are many outliers in this data. These points have residuals
that are much larger than the others. Typically they are three to four standard deviations
from the mean (Montgomery, Peck, Vining 2001). The residuals in each scenario are analyzed,
and points lying three to four standard deviations from the mean are considered potential
outliers. The studentized residuals are also plotted versus predicted values to look for
outliers. These points are not representative of the rest of the data and could possibly
have serious effects on the regression model.
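The three standard deviation screen described above can be sketched as follows. This is a minimal illustration on hypothetical residuals; the function name and threshold argument are assumptions, and it uses plain standardized distances rather than the studentized residuals that JMP plots.

```python
from statistics import mean, stdev

def flag_outliers(residuals, threshold=3.0):
    """Return indices of residuals more than `threshold` sample SDs from the mean."""
    centre = mean(residuals)
    spread = stdev(residuals)
    return [i for i, e in enumerate(residuals)
            if abs(e - centre) / spread > threshold]

# Hypothetical residuals in minutes: twenty small values and one extreme point.
residuals = [1.0, -1.0] * 10 + [30.0]
suspects = flag_outliers(residuals)
```

A flagged index only marks a point for review; whether it is kept or removed remains a judgment on the underlying record.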

If  no  error  is  found  and  the  point  is  just  unusual,  then  it  should  be  kept  in  the 
model.  “Deleting  these  points  to  ‘improve  the  fit  of  the  equation’  can  be  dangerous,  as  it 
can  give  the  user  a  false  sense  of  precision  in  estimation  or  prediction”  (Montgomery, 
Peck, Vining, p. 154, 2001). The SME decides if these points should stay in the model or
be  eliminated. 

3.5. Multicollinearity

Multicollinearity  is  examined  to  find  correlation  between  regressors.  When  “there 
are  near  linear  dependencies  among  the  regressors  the  problem  of  multicollinearity 
exists”  (Montgomery,  Peck,  Vining,  p.  325,  2001).  This  can  cause  the  inferences  based 
on  the  regression  model  to  be  flawed  or  misleading. 

Areas looked into for this study are the data collection method employed
and model specification. The data collection method can cause multicollinearity to occur
if only a subspace of the samples is taken (Montgomery, Peck, Vining, 2001). This
definitely occurs in the data due to GATES and GDSSII not matching: some data are lost,
creating a subset of the population, so some multicollinearity may be present. Model
specification is also a possible source of multicollinearity and is looked at if strong
correlations exist between variables.

Detecting  multicollinearity  is  based  on  the  examination  of  the  factor  correlation 
matrix  and  studying  the  variance  inflation  factors  (VIF).  The  examination  of  the 
correlation  matrix  involves  looking  at  the  off  diagonal  elements  of  the  X'X  matrix 
(Montgomery, Peck, Vining, 2001). If the absolute value of an off-diagonal element is
close to 1.0, then there is a strong linear dependence. In JMP, this is accomplished by
entering the regressors found from the stepwise regression in each scenario into JMP's
multivariate analysis tool. This
yields  a  correlation  matrix  and  scatterplot  matrix  for  all  used  regressors  and  the  Y 
variable. 

The VIF is examined for each scenario. One or more large VIFs indicate
multicollinearity. JMP finds the VIF by using the inverse correlation matrix. This is
accomplished using the same multivariate tools as in the correlation matrix and
scatterplot. The diagonal elements of (X'X)⁻¹ are the VIF values. VIF values below five
are not considered to indicate multicollinearity.

3.6. Model Validity

JMP creates a prediction expression for each scenario. This prediction expression
uses the intercept and parameter estimates for the regressors found during the stepwise
regression. This model is then checked by comparing the values from JMP's predicted value
functionality to the actual times on the ground. The prediction error sum of squares (PRESS
statistic) is analyzed. The PRESS statistic is the sum of the squared PRESS residuals and
measures model quality (Montgomery, Peck, Vining, 2001). Small values of PRESS are
desired.

The Box-Cox transformation is used to stabilize variance issues in the data. It is
applied to the Y variable to correct for any possible non-constant variance. JMP allows
this by selecting the Box-Cox Y transformation after the regression is run. This shows a
plot of the Sum of Squares Error by λ. After viewing the table of estimates, the best
transformation is saved to the initial data table and reviewed.
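For readers unfamiliar with the transformation, the following sketch shows the Box-Cox power family in the geometric-mean-scaled form given in standard regression texts, which keeps the Sum of Squares Error comparable across values of λ. The function name and the sample ground times are hypothetical, and whether JMP applies exactly this scaling is an assumption.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transformation scaled by the geometric mean of y."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive responses")
    # The geometric mean scaling keeps SSE values comparable across lambdas.
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]

# Hypothetical ground times in minutes.
times = [60.0, 120.0, 240.0]
identity_like = box_cox(times, 1.0)  # lambda = 1 only shifts the data by 1
root_like = box_cox(times, 0.5)      # lambda = 0.5 corresponds to a square root
log_like = box_cox(times, 0.0)       # lambda = 0 corresponds to a logarithm
```

In practice the regression is refit for a grid of λ values and the λ with the smallest Sum of Squares Error is preferred.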

Next, the model is analyzed with the seventh month of data that was split from the
original data. The prediction expression is applied to each scenario's data. This results
in a predicted time on the ground for each aircraft at each field. The predicted times are
then compared with the actual times on the ground. A paired t test of both sets of data is
accomplished to gain understanding and determine if the model is valid. Results that
show no statistical difference between the predicted and actual times yield a regression
model that can be applied to real world operations.

3.7. Summary

The  next  chapter  presents  results  and  analysis.  The  statistical  techniques  presented  in 
this  chapter  are  applied.  Results  are  shown  and  the  predictive  capabilities  of  the  various 
models are tested. Additional impacts to AMC’s worldwide airlift operations are analyzed
and shown.

4. Results and Analysis


This chapter describes the results found and analysis conducted for each of the
eight scenarios. First, the stepwise regression is shown, and the Sum of Squares Error,
Degrees of Freedom for Error, the coefficient of determination (R²), adjusted R², and
Mallows's Cp statistic are reported. Next, the model is built and checked for adequacy.
Then, outliers and multicollinearity are considered. Finally, each scenario's model is
checked for validity.


4.1. Stepwise Regression

Stepwise  regression  is  conducted  in  all  eight  scenarios.  This  is  accomplished 
using  JMP  and  the  mixed  method  of  stepwise  regression.  This  gives  the  most  suitable 
model  based  on  the  p-value  of  0.1  for  inclusion  and  0.05  for  exclusion.  Appendix  A 
provides  the  output  for  all  eight  scenarios.  Table  4.1  summarizes  the  stepwise  findings 
for  each  scenario. 


Table 4.1 Stepwise Regression Results

Scenario         R²      adj R²   Intercept   # Parameters   Cp      AICc    min AICc
1  C-17 OAIX     0.257   0.249    172.7       5               5.3    4034    4032
2  C-17 OAKN     0.532   0.525    167.5       7               5.2    4147    4145
3  C-17 OAZI     0.305   0.301    161.1       4               3.7    4965    4965
4  C-130 OAIX    0.109   0.102     57.9       5              -2.5    5682    5682
5  C-130 OAKN    0.144   0.135    105.5       5               7.9    4332    4330
6  C-130 OAZI    0.091   0.08      51         3               4.7    1557    1557
7  C-5 OAIX      0.487   0.438    364.4       3               1.7     272     272
8  C-5 OAKN      0       0        266.1       1              -4.7     475     475

Information in Table 4.1 illustrates that some of the models are initially better
than the others. All models that are examined more closely have an R² above 0.25; the
other models are not continued in the data analysis. The C-130 scenarios are not
continued due to low variance in their ground time based on cargo. This is attributed to
the C-130s' use of engine running cargo operations and their small and quick cargo
offloads and onloads. Model eight is not a good model because no regressors met the
stepwise significance threshold for inclusion.

Adjusted R² is provided in Table 4.1. In all scenarios, the adjusted R² closely
matches the R². The adjusted R² values penalize for the addition of multiple regressors
that inflate R². Therefore, based on the R² and adjusted R², and the eliminated scenarios
for this research, models 1, 2, 3 and 7 are continued.
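To make the penalty concrete, the sketch below applies the usual adjusted R² formula to Scenario 7. The R² and parameter count come from Table 4.1; the figure of 24 observations is taken from the Obs column of Table 4.2 and is assumed to apply to the untransformed fit as well. The function name is illustrative.

```python
def adjusted_r2(r2, n_obs, n_params):
    """Adjusted R^2, where n_params counts the intercept as a parameter."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params)

# Scenario 7 (C-5 OAIX): R^2 = 0.487 and 3 parameters from Table 4.1;
# 24 observations is an assumption carried over from Table 4.2.
adj = adjusted_r2(0.487, 24, 3)
```

With so few observations the penalty is visible (0.487 falls to about 0.438), whereas the C-17 scenarios, with several hundred observations each, lose less than 0.01.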

Next, Mallows's Cp statistic is reviewed from Table 4.1. As Montgomery et al.
(2001) stated, small values of Cp are desirable. This statistic should be close to p for a
good model and preferably lower in value than p. All scenarios meet this qualification
with the exceptions of 1, 5, and 6, but all of the scenarios fall under 2p, which is taken
as an acceptable level of bias (Mallows, 1973). Therefore, all scenarios are acceptable in
regard to Mallows's Cp statistic.
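The comparison of Cp against p and 2p can be checked directly from the values in Table 4.1. The short sketch below is only a bookkeeping aid; the variable names are illustrative.

```python
# (number of parameters p, Mallows's Cp) per scenario, copied from Table 4.1.
table_4_1 = {
    1: (5, 5.3), 2: (7, 5.2), 3: (4, 3.7), 4: (5, -2.5),
    5: (5, 7.9), 6: (3, 4.7), 7: (3, 1.7), 8: (1, -4.7),
}

# Scenarios whose Cp exceeds p, and those whose Cp exceeds the 2p bound.
above_p = [s for s, (p, cp) in table_4_1.items() if cp > p]
above_2p = [s for s, (p, cp) in table_4_1.items() if cp > 2 * p]
```

The first list reproduces the exceptions named in the text (Scenarios 1, 5, and 6), and the second list is empty.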

Finally,  AICc  is  evaluated  as  a  measure  of  goodness  of  fit.  All  of  the  scenarios 
fall within 1-2 of the minimum AICc. This shows that there is relatively little difference
between  the  AICc  of  the  model  and  the  minimum  AICc.  Therefore,  each  model  can  be 
used  to  make  inferences  on  the  scenarios. 

4.2. Normal Standard Least Squares

The stepwise regression outcomes are put into the actual standard least squares
model in JMP. This is accomplished by confirming the Y variable and regressors and
selecting the standard least squares Make Model option in the stepwise regression screen.
The output can be seen in Appendix B. Figures B1-B8 represent scenarios 1-8. This
output includes the Actual by Predicted Plots, Summary of Fit, Analysis of Variance,
Lack of Fit, Parameter Estimates and Residual by Predicted Plot. Figure B8 is blank due
to the lack of regressors chosen to enter from the stepwise regression.

4.3. Model Adequacy

The  information  provided  by  JMP  in  the  normal  standard  least  squares  is  used  to 
determine  model  adequacy.  This  is  accomplished  to  see  that:  1)  the  observations  are 
adequately  described  by  the  model,  2)  the  errors  are  normally  distributed,  3)  the  errors  are 
independently  distributed  (assumed  due  to  historical  data),  4)  the  errors  have  a  constant, 
but  unknown,  variance,  and  5)  the  errors  have  a  mean  of  zero. 

To determine if the observations are adequately described by the model, adjusted
R² is reviewed. Each model has specific constants that create longer or shorter times on
the ground and create an environment where more variance can be explained by the
model. Therefore, these scenarios cannot be compared to each other based purely on R².
The initial R² values for the scenarios continued are above 0.25. This is adequate for each
model.

Errors  that  are  normally  distributed  can  be  seen  using  the  normal  quantile  plot 
from JMP. This is accomplished for models 1, 2, 3, and 7 in Figures 4.1 through 4.4. All
normal  quantile  plots  fall  within  a  reasonable  distance  from  a  straight  line  through  their 
center  (pass  the  fat  pencil  test)  with  allowable  trailing  data  on  the  extremities.  Therefore, 
all are assumed to be normally distributed.

Figure 4.1 C-17 OAIX Normal Plot

Figure 4.2 C-17 OAKN Normal Plot

Figure 4.3 C-17 OAZI Normal Plot

Figure 4.4 C-5 OAIX Normal Plot

JMP produces the plot of residuals versus predicted values to examine constant,
but unknown, variance. This can be seen in Figures 4.5 through 4.8. The residuals do not
follow any particular pattern such as a funnel, megaphone or bow. There does
appear to be evidence of missing data. This is inevitable when using historical data that is
highly erroneous. There is also evidence of a possible split data set, as in Figure 4.5.
After an intensive look through the data, no evidence or similarities are observed that
would allow the data to be split into two separate data sets. Anything nonlinear should be
addressed and a transformation should be accomplished to alleviate any non-constant
variance. Due to the erroneous nature of the data and the slight downward trend in the
data, transformations are accomplished in order to alleviate any non-constant variance.


Figure 4.5 C-17 OAIX Residuals by Predicted Plot

Figure 4.6 C-17 OAKN Residuals by Predicted Plot

Figure 4.7 C-17 OAZI Residuals by Predicted Plot

Figure 4.8 C-5 OAIX Residuals by Predicted Plot


Transformations used include the square root, logarithmic, and reciprocal. These
transformations are shown below for the C-17 at OAZI. This is the case with the most
variance, resembling a semi-funnel shape. From looking at Figures 4.9-4.11 and
reviewing the data from the regression using the transformations, the square root
transformation reduced the variance by the greatest amount. Therefore, the square root
transformation is used on all four remaining models.


[Residual by Predicted plots; horizontal axis: ACTUAL TIME ON GROUND mins Predicted]

Figure 4.9 C-17 Square Root Transformation

Figure 4.10 C-17 Log Transformation

Figure 4.11 C-17 Reciprocal Transformation

The square root transformation output is displayed in Appendix C. A summary of
this data is presented in Table 4.2. This table shows slight increases and decreases in R²
and adjusted R², and the charts show less funneling or bowing effects. Therefore, constant
variance is mostly achieved.


Table 4.2 Square Root Transformation

Scenario        R²      adj R²   Intercept   # Parameters   RMSE   Obs   Variance
1  C-17 OAIX    0.271   0.263    12.89       5              2.1    374   Constant
2  C-17 OAKN    0.512   0.505    12.82       7              1.9    383   Constant
3  C-17 OAZI    0.266   0.258    12.65       4              2      457   Constant
7  C-5 OAIX     0.605   0.567    18.99       3              1.5     24   Constant

Determining the mean of the residual errors is accomplished by saving the residuals in a
column in the JMP spreadsheet. Then, the column is transferred into the EXCEL program
and averaged. In all cases, the average is zero or an extremely small number that
approximates zero. The JMP distribution fitting tool also shows a mean of zero.
Therefore, the last assumption holds for the four remaining scenarios and these models
pass the adequacy checks.


4.4. Outliers

Outliers are reviewed using the three standard deviations method described above and
studentized residuals. When residuals fall more than three standard deviations from the
mean, or they are shown as outliers on the plot of studentized residuals, the actual data
points are examined for errors. If the SME believes an error has occurred, the point is
eliminated. Table 4.3 shows the summary of data points found to be potential outliers
from both methods. Figures 4.12-4.15 show the actual studentized residuals versus
predicted values with the outliers in grey and circled.

Table 4.3 Summary of Outlier Detection

Scenario        > 3 Std Dev   Studentized Plot   SME determined Error   new R²   new adj R²
1  C-17 OAIX    0             0                  0                      N/A      N/A
2  C-17 OAKN    2             0                  1                      0.504    0.498
3  C-17 OAZI    2             2                  0                      N/A      N/A
7  C-5 OAIX     0             1                  0                      N/A      N/A

Figure 4.12 C-17 OAIX Outliers

Figure 4.13 C-17 OAKN Outliers

Figure 4.14 C-17 OAZI Outliers

Figure 4.15 C-5 OAIX Outliers

There are no outliers found for Scenario 1 in either method. In Scenario 2, there
are 2 outliers found using the standard deviation method and none found using the plot of
studentized residuals versus predicted values. The SME determined that one point is
erroneous and the other is unusual. The erroneous point listed 11 equivalent pallet
positions offloaded, consisting of one T2 and three T3 pallet trains. This is not a
possible combination in the C-17; therefore, the data are erroneous and the point is
excluded. The second point was on the ground for an extended amount of time with 135
passengers offloaded and 95 onloaded. This is slightly unusual, but not erroneous, so
this point is kept.

The  same  two  potential  outliers  for  Scenario  3  were  found  in  both  methods.  These 
both  had  minimal  ground  times  of  60  and  61  minutes  with  a  large  PC  offload  of  18  and 
17  pallets,  respectively,  and  no  onload.  The  SME  determined  that  this  is  not  an  unusual 
occurrence  to  have  a  large  offload  and  no  onload  at  OAZI  and  no  errors  were  found  in  the 
data  or  delay  remarks.  Therefore,  these  points  were  maintained  in  the  model. 

The  potential  outlier  for  Scenario  7  was  only  found  using  the  studentized  residuals 
versus  the  predicted  values.  This  point  is  deemed  unusual  but  not  erroneous  by  the  SME. 
This  point  had  a  small  ground  time  (108  minutes)  and  a  relatively  small  offload  (17 
equivalent  pallet  positions)  with  no  onload;  therefore,  it  is  slightly  unusual  but  not 
erroneous. 


4.5. Multicollinearity

In order to look for possible multicollinearity, the correlation matrices and
variance inflation factors (VIF) of the remaining four scenarios are reviewed. The
correlation matrices are shown in Tables 4.4-4.7. The largest VIFs, taken from the
diagonal of the inverse correlation matrices, are listed in Table 4.8. The matrices only
involve the regressors found in the stepwise regression.


Table 4.4 Scenario 1 C-17 OAIX

                  ACTUAL TIME   PC off    RS off    T6 on     pax offloaded
                  ON GROUND
Act Time on Gnd    1.0000        0.1618    0.2414   -0.0663   -0.4689
PC off             0.1618        1.0000   -0.3048    0.0001   -0.1308
RS off             0.2414       -0.3048    1.0000    0.1043   -0.3477
T6 on             -0.0663        0.0001    0.1043    1.0000   -0.0354
pax offloaded     -0.4689       -0.1308   -0.3477   -0.0354    1.0000

Table 4.5 Scenario 2 C-17 OAKN

                  ACTUAL TIME   PC off    T2 off    PC on     RS on     T3 on     pax offloaded
                  ON GROUND
Act Time on Gnd    1.0000        0.6710    0.1105   -0.1823   -0.1803   -0.0782   -0.2826
PC off             0.6710        1.0000   -0.0194   -0.0771   -0.0766    0.1207   -0.2991
T2 off             0.1105       -0.0194    1.0000    0.0796   -0.0021   -0.0239   -0.0831
PC on             -0.1823       -0.0771    0.0796    1.0000    0.0204   -0.0070   -0.1211
RS on             -0.1803       -0.0766   -0.0021    0.0204    1.0000   -0.0277   -0.0804
T3 on             -0.0782        0.1207   -0.0239   -0.0070   -0.0277    1.0000   -0.0584
pax offloaded     -0.2826       -0.2991   -0.0831   -0.1211   -0.0804   -0.0584    1.0000

Table 4.6 Scenario 3 C-17 OAZI

                  ACTUAL TIME   PC off    RS on     pax offloaded
                  ON GROUND
Act Time on Gnd    1.0000        0.5360   -0.1222   -0.2273
PC off             0.5360        1.0000   -0.0348   -0.2801
RS on             -0.1222       -0.0348    1.0000   -0.0256
pax offloaded     -0.2273       -0.2801   -0.0256    1.0000

Table 4.7 Scenario 7 C-5 OAIX

                  ACTUAL TIME   BC off    PC off
                  ON GROUND
Act Time on Gnd    1.0000       -0.5851   -0.2957
BC off            -0.5851        1.0000   -0.1376
PC off            -0.2957       -0.1376    1.0000

Table 4.8 VIF Scenarios 1, 2, 3, and 7

               Scenario 1   Scenario 2   Scenario 3   Scenario 7
               C-17 OAIX    C-17 OAKN    C-17 OAZI    C-5 OAIX
Largest VIF    1.4332       2.1747       1.4581       1.9475

Multicollinearity is not seen from either the correlation matrices or the inverse
correlation matrices. The correlation matrices show no correlation between regressors
greater than 0.35 in absolute value. The inverse correlation matrices show no value greater
than 2.1747. Any values lower than five are not considered to show multicollinearity.
Therefore, there is no evidence of multicollinearity in any of the remaining scenarios.
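As an illustration of reading values off the diagonal of an inverse correlation matrix, the sketch below inverts the Scenario 7 matrix from Table 4.7 with a small Gauss-Jordan routine. The helper name `invert` is an assumption for this example. Inverting the full matrix (response included) gives a largest diagonal element of about 1.95, which is close to the Scenario 7 entry in Table 4.8; this suggests, as an inference rather than something stated in the text, that the reported values were read from a matrix that included the response. Restricting the matrix to the two regressors gives about 1.02. Both are far below five.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Table 4.7 (Scenario 7): order is actual time on ground, BC off, PC off.
full = [
    [1.0000, -0.5851, -0.2957],
    [-0.5851, 1.0000, -0.1376],
    [-0.2957, -0.1376, 1.0000],
]
diag_full = [invert(full)[i][i] for i in range(3)]

# Regressors only (BC off, PC off): the textbook VIF definition.
regressors = [[1.0000, -0.1376], [-0.1376, 1.0000]]
vif = [invert(regressors)[i][i] for i in range(2)]
```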

4.6. Model Validity

Model validity is checked in three ways. First, the prediction expression is
checked  using  the  predicted  values  from  the  test  data  versus  the  actual  values  using  a 
paired  t  test  and  a  95%  confidence  interval.  Next,  the  prediction  error  sum  of  squares 
(PRESS)  statistic  is  analyzed  for  each  regression  scenario.  Finally,  the  regression 
equation  is  used  with  the  split  data  to  compare  the  means  of  the  actual  versus  the 
predicted  values.  A  Box-Cox  transformation  is  also  attempted  to  increase  prediction 
capability. 

The first step involves testing the regression equations against the actual values
from which the equations were derived. (This is similar to the residuals from the
regression.) The hypothesis is that the means should be the same. The prediction
expression found in the final standard least squares run for each scenario is evaluated
using the actual cargo and passenger numbers from the initial data to produce the predicted
values. These values are compared to the actual ground times using a paired t test with an
alpha level of 0.05. They are also compared using a 95% confidence interval (CI). The 95%
CI should encapsulate zero. If there is a significant difference or the 95% CI does not
encapsulate zero, the regression equation is not useful. The results from the paired t
tests are displayed in Table 4.9, where H0: μ = 0 and Ha: μ ≠ 0.


Table 4.9 Paired t test and 95% CI for original data using regression equations

Scenario        t Stat   P(T<t) two-tail   Mean difference   95% Confidence Interval
1  C-17 OAIX    1.228    0.2199            -4.35             (-9.68, 0.98)
2  C-17 OAKN    0.703    0.4825            -3.40             (-8.69, 1.88)
3  C-17 OAZI    1.189    0.2352            -4.09             (-9.12, 0.92)
7  C-5 OAIX     0.103    0.9187            -2.05             (-26.5, 22.4)

From the paired t test performed to determine if the means are different, it can be
surmised that the null hypothesis is not rejected. Also, each 95% CI includes zero;
therefore, the mean difference between the two data sets is not significantly different
from zero. This is a good result and is expected because the same data were used to derive
the regression equation.
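The paired comparison described above can be sketched as follows. The ground times are hypothetical, the function name is illustrative, the sign convention (actual minus predicted) is an assumption, and the critical value is supplied by the caller from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(actual, predicted, t_crit):
    """Paired t statistic and confidence interval for mean(actual - predicted).

    t_crit is the two-sided critical value for n - 1 degrees of freedom,
    looked up from a t table by the caller.
    """
    diffs = [a - p for a, p in zip(actual, predicted)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    t_stat = d_bar / se
    return t_stat, d_bar, (d_bar - t_crit * se, d_bar + t_crit * se)

# Hypothetical ground times in minutes (not study data).
actual = [152.0, 139.0, 163.0, 150.0, 141.0]
predicted = [150.0, 140.0, 160.0, 150.0, 140.0]
# 2.776 is the two-sided 5% critical value of Student's t with 4 degrees of freedom.
t_stat, mean_diff, ci = paired_t(actual, predicted, t_crit=2.776)
fail_to_reject = ci[0] < 0.0 < ci[1]
```

Here the interval contains zero, so the null hypothesis of equal means would not be rejected for this toy data set.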

Next,  the  prediction  error  sum  of  squares  (PRESS)  statistic  is  analyzed.  The 
PRESS  statistic  is  found  in  the  output  of  each  scenario  in  JMP.  This  is  chosen  using  the 
response  selection.  The  PRESS  value  for  each  model  is  shown  in  Table  4.10.  Small 
values  of  PRESS  are  favorable.  From  the  PRESS  statistics  in  Table  4.10,  it  is  seen  that  all 
are  very  close  to  SSE.  Therefore,  all  models  have  a  good  PRESS  statistic. 
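For context on why PRESS is compared with SSE, the sketch below computes both for a one-regressor least squares fit, using the leverage shortcut in which each PRESS residual is the ordinary residual divided by one minus its hat-matrix diagonal. The data and function name are hypothetical; the study's models have several regressors, but the idea is the same.

```python
def press_and_sse(x, y):
    """Return (PRESS, SSE) for a one-regressor least squares fit y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    press = 0.0
    sse = 0.0
    for xi, yi in zip(x, y):
        e = yi - (b0 + b1 * xi)                # ordinary residual
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx  # leverage (hat matrix diagonal)
        sse += e ** 2
        press += (e / (1.0 - h)) ** 2          # squared PRESS residual
    return press, sse

# Small hypothetical data set.
x_vals = [0.0, 1.0, 2.0, 3.0]
y_vals = [0.0, 1.0, 1.0, 3.0]
press, sse = press_and_sse(x_vals, y_vals)
```

PRESS is never smaller than SSE, and a PRESS only slightly above SSE indicates that no single observation is driving the fit.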


Table 4.10 PRESS Statistic

Scenario        PRESS statistic   SSE
1  C-17 OAIX    1657              1626
2  C-17 OAKN    1361              1303
3  C-17 OAZI    1908              1878
7  C-5 OAIX       58                49

The final step includes using the derived prediction expressions with the split data
for the month of July 2010. The regression equations should result in planned ground
times that are not significantly different from the actual ground times. The hypothesis is
that the means should be the same. The prediction expression found in the final standard
least squares run for each scenario is evaluated using the actual cargo and passenger
numbers from the July data to produce the predicted values. These values are compared to
the actual times using a paired t test with an alpha level of 0.05. They are also compared
using a 95% confidence interval (CI). The 95% CI should encapsulate zero. If there is a
significant difference and the 95% CI does not encapsulate zero, the prediction
expression is not useful. The distribution of the actual ground time in July was also
analyzed and is similar to the original data. The prediction expression for each scenario is
shown in Table 4.11. The results from the paired t tests are displayed in Table 4.12,
where H0: μ = 0 and Ha: μ ≠ 0.
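As a sketch of how one of these prediction expressions might be applied, the example below evaluates the C-17 OAZI expression printed in Table 4.11 and squares the result to return to minutes. It assumes the fitted response is the square root of ground time in minutes; the dictionary keys, function name, and the sample load are hypothetical.

```python
# Coefficients for the C-17 OAZI expression as printed in Table 4.11.
# Assumption: the fitted response is the square root of ground time in minutes.
OAZI_INTERCEPT = 12.648
OAZI_COEFFS = {"pc_off": 0.1762, "rs_on": -0.4077, "pax_off": -0.00516}

def predict_ground_minutes(intercept, coeffs, loads):
    """Evaluate the linear expression, then square it to undo the square root."""
    root_minutes = intercept + sum(
        coeffs[name] * loads.get(name, 0.0) for name in coeffs
    )
    return root_minutes ** 2

# Hypothetical missions: no load at all, and 10 units of PC offload only.
empty = predict_ground_minutes(OAZI_INTERCEPT, OAZI_COEFFS, {})
loaded = predict_ground_minutes(OAZI_INTERCEPT, OAZI_COEFFS, {"pc_off": 10})
```

With no cargo or passengers the prediction is simply the squared intercept, about 160 minutes; the hypothetical 10-unit PC offload raises it to roughly 208 minutes.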


Table 4.11 Prediction Expressions Square Root Transformation

Scenario     Intercept   BC off   PC off   RS off   T2 off   PC on    RS on     T3 on    T6 on    Pax off
             (minutes)
C-17 OAIX    12.89                0.0739   0.1002                                        -0.796   -0.0161
   = 12.89 + 0.07387 * PC off + 0.1002 * RS off - 0.7955 * T6 on - 0.01606 * Pax off
C-17 OAKN    12.82                0.2382            0.2657   -0.119   -0.1962   -0.848            -0.0066
   = 12.82 + 0.2382 * PC off + 0.2657 * T2 off - 0.119 * PC on - 0.1962 * RS on - 0.848 * T3 on - 0.0066 * Pax off
C-17 OAZI    12.65                0.1762                              -0.4077                     -0.0052
   = 12.648 + 0.1762 * PC off - 0.4077 * RS on - 0.00516 * Pax off
C-5 OAIX     18.99       -0.532   -0.092
   = 18.99 - 0.5318 * BC off - 0.092 * PC off

Table 4.12 Paired t test and 95% CI for Prediction Expressions with July data

Scenario   Aircraft   Airfield   Obs   t Stat   P(T<t) two-tail   Mean difference   95% Confidence Interval
1          C-17       OAIX       197    4.3     0.00002           -21.3             (-29.2, -13.3)
2          C-17       OAKN       185   -1.8     0.07               12.24            (2.5, 21.98)
3          C-17       OAZI        65    0.14    0.89                1.48            (-17.2, 20.2)
7          C-5        OAIX        13   -2.1     0.057              80.78            (-1.8, 163.4)

From these results, it is apparent that the models for Scenarios 3 and 7 are the only
two models that have significance with both the t test and the 95% CI. This is shown
by the two-tail p-value being above 0.05 and the 95% CI including zero. Scenario 2 shows
significance in the t test, but not in the 95% CI.

Next, a Box-Cox transformation is conducted to alleviate more variance and
increase the predictability of the models. This is accomplished through JMP using the
best Box-Cox transformation that JMP produces. Table 4.13 shows the results from using
a Box-Cox transformation instead of the square root transformation.


Table 4.13 Paired t test and 95% CI using Box-Cox Transformation

Scenario   Aircraft   Airfield   Obs   t Stat   P(T<t) two-tail   Mean difference   95% Confidence Interval
1          C-17       OAIX       197    4.24    0.000028          -21.02            (-29.0, -13.04)
2          C-17       OAKN       185   -1.51    0.13               10.25            (0.38, 20.13)
3          C-17       OAZI        65   -0.29    0.77                3.05            (-15.6, 21.7)
7          C-5        OAIX        13   -1.57    0.14               61.46            (-23.4, 146)

The Box-Cox transformation slightly increases the significance in Scenarios 2 and 7,
but decreases it in Scenario 3.




The prediction expression values for Scenarios 2, 3, and 7 are also compared with
the actual ground times in July to realize potential savings in ground time. During the
months of January-June 2010, the average early and late times over all missions were 36
minutes early and 45 minutes late. The models' predicted values over the month of July
decrease the minutes early to 28 and the minutes late to 39. This does not look like a
significant change, but over the course of a month, with around 200 C-17 and C-5
missions through the scenarios, this equates to lowering the error in planning by 4445
minutes or 74.1 hours.

This does not necessarily mean that throughput will be increased or decreased. As seen
with the prediction expressions in Table 4.9, the intercept (once squared) averages 2.73
hours ± regressors on the ground for the C-17 and 6 hours ± regressors for the C-5. This
is an increase from the maximum planning ground times of 2.25 hours for C-17s and 3.25
hours for C-5s. Actual throughput values follow: Camp Bastion averages 3 C-17s per day
with a 2010 maximum of 9 C-17s in one day, Kandahar AB averages 4 C-17s per day with a
2010 maximum of 13 C-17s in one day, and Bagram AB averages 1.4 C-5s per day with a 2010
maximum of 5 C-5s in one day. The average and maximum numbers of aircraft transiting
these airfields on any given day leave room for the possible increase in ground time.
Using the models with maximum on the ground values at each location, the maximum daily
throughput is 18 C-17s at Camp Bastion, 26 C-17s at Kandahar AB, and 8 C-5s at Bagram AB.
These numbers are 1.6 to 2 times the maximum daily throughput observed in 2010.
Therefore, throughput should not be affected by the new model, and the entire process
should become more predictable and stable.
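
One simple way to arrive at daily throughput figures of this kind is to multiply the
maximum on the ground (MOG) value by the number of ground-time slots that fit in a day.
The Python sketch below assumes 24-hour operations and uses illustrative MOG values of
2, 3, and 2, which are assumptions rather than values taken from this research. The
exact calculation behind the figures in the text is not stated, so this is only one
plausible reading.

```python
def daily_throughput(mog, ground_time_hours, hours_per_day=24.0):
    """Approximate aircraft per day: parking spots times turns per spot per day."""
    return mog * hours_per_day / ground_time_hours


# (airfield, aircraft, hypothetical MOG, average ground time in hours from the text)
cases = [
    ("Camp Bastion", "C-17", 2, 2.73),
    ("Kandahar AB", "C-17", 3, 2.73),
    ("Bagram AB", "C-5", 2, 6.0),
]

for airfield, aircraft, mog, ground_time in cases:
    estimate = daily_throughput(mog, ground_time)
    print(f"{airfield}: about {estimate:.1f} {aircraft}s per day")
```

With these assumed MOG values, the estimates round to 18, 26, and 8 aircraft per day.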

4.7. Summary

The results from Chapter 4 show that ground times can be accurately predicted using
historical cargo data and ground times in three of the eight scenarios. The remaining
five scenarios do not have sufficient significance to predict ground times. Scenarios 3
and 7 hold the strongest significance, with both the t test and the 95% CI showing
accurate prediction capability. Scenario 2 shows significance in the t test but not the
95% CI. This result is still suitable for use in future predictions. Therefore, the
linear regression model built for Scenario 2 could be used to accurately predict C-17
ground times at OAKN. Chapter 5 summarizes the conclusions drawn from this research,
outlines the obstacles to implementation of results, and suggests future areas of
research that broaden the scope of this research effort.
5. Discussion


5.1.  Conclusion 

Cargo aircraft provide essential military supplies to Afghanistan around the clock.
Accurate scheduling of ground times in theater is critical to providing needed supplies
to combat troops in an orderly manner. Using historical data from the GATES and GDSSII
data systems, a linear regression model was developed to predict ground times accurately
for three different aircraft and airfields, with eight total scenarios. Three of these
scenarios resulted in useful models that were validated using split historical data.
These models give the mission planners at TACC a more accurate tool to predict ground
times. They can be used in the future to stabilize ground times in theater and schedule
aircraft in a more accurate and efficient manner.

5.2.  Unexplained  Variance 

Throughout this study, some factors in some phases of the mission from landing to takeoff
were assumed constant. These factors may not have been exactly constant and therefore
could have led to unexplained variance that impacted some scenarios. The phases include
landing, taxiing into park, offloading, onloading, taxiing for takeoff, and takeoff.
Some of the factors could include motivation of aerial port crews or aircraft crews,
aerial port overtasking, and deployment rotations, to name a few.

Motivation of an aerial port crew or an aircraft crew can lead to significant variance.
For example, if the aerial port crew and the aircraft crew are very motivated, the ground
time could be minimal. This is especially the case if the crew requests an early takeoff
from TACC. Conversely, a less motivated port crew and aircraft crew could lead to a much
longer ground time than expected. Any combination of these factors will lead to
unexplained variance that affects the outcome of this type of model.

Aerial port overtasking is another source of unexplained variance. It can result from
numerous complications. One is insufficient ground time for an aircraft on the ground.
Another is a severe maintenance problem. Both of these can back up the entire field for
hours or days. Other factors could include undermanning or overmanning of aerial port
crews.

Deployments are a constant cause of variance in theater. Four- to six-month deployments
result in a learning curve for all Airmen who handle this process. Great lengths are
taken to reduce this learning curve, but it still occurs in the system, and its effect
is mostly unknown in this analysis.

5.3. Errors in Databases

The GATES and GDSSII databases contain a large amount of error. This can be seen in
almost all aspects of the system. The main sources of error were controllers changing
scheduled times in GDSSII, GATES not accurately depicting significant differences in
cargo type, and erroneous or missing delay remarks.

Some conclusions could not be drawn because controllers changed scheduled takeoff times
in the system. If a mission is slipped for some reason while on the ground, the scheduled
takeoff time should not be changed. The actual takeoff time will reflect the slip, and
the delay codes or remarks should give the reason for the change. This is a large source
of error and eliminated a potential avenue for exploring scheduled ground times versus
actual ground times with regard to the cargo on board.

The cargo needs to be better defined in GATES. The rolling stock data should be changed
to include three significantly different types of cargo as well as particular outsized
cargo, which takes significantly longer to offload and upload. This would eliminate a
large amount of variance in offload and upload times.

Rolling stock needs to be divided into drivable rolling stock, rolling stock that needs
to be winched on and off the aircraft, and rolling stock that needs shoring. These three
types of rolling stock take significantly different amounts of time to onload and
offload. For example, drivable rolling stock (e.g., cars) can be easily offloaded, while
a heavy power cart may need to be winched or towed onto the aircraft. Shoring is needed
when the clearance of the vehicle going on or off the aircraft is too low for the angle
of the ramp; pieces of wood are then needed to decrease the angle of the cargo ramp,
which takes longer to organize and put together. Therefore, lumping all of these items
into one category is not useful for this analysis.

Helicopters and other outsized cargo known to have a longer onload or offload time also
need to be categorized separately because of the increased time needed to safely move
such cargo. This would reduce error and variance in the system.

5.4. Limitations

The significant models should be used by experienced mission planners. Not all scenarios
are derived and tested throughout the results. For example, there is an error bound that
the individual mission planner must assess and judge acceptable for the mission at hand.
Some ground times from these equations may seem very long because certain regressors add
significant time if the aircraft is entirely full of one type of cargo. This should be
scrutinized by AMC/A9 in the accreditation phase.

There are many more factors that the mission planners may need to consider when
determining ground time. The results from the models should be used as a base on which
the mission planner can build. Some factors may occur simultaneously, while others may
require additional ground time after cargo loading and unloading are completed. It is up
to the individual mission planner, and ultimately the aircraft commander, to implement
the appropriate ground time in a safe manner.

5.5. Recommendations

This analysis resulted in many recommendations. They cover improving the accuracy of data
collection, collecting additional data items, the method of inputting data, the method of
scheduling C-130 ground times specifically, and the use of the resulting models. All of
these recommendations would enhance USTRANSCOM operations around the world.

Throughout this study, it was noted that many improvements in the data collection process
would have led to more significant results. This can be seen from inaccurate data points
throughout both databases. USTRANSCOM needs to take a more stringent approach to accurate
data collection. The amount of erroneous data in the system, at a minimum, costs the
taxpayers millions each year. Effective data collection could reduce the number of
aircraft needed in theater by helping to build more accurate models and analytical tools.
Additional data needs to be collected. This data includes the times at which Aerial Port
begins offload, finishes offload, begins onload, and finishes onload. These times, along
with relatively constant times for taxi and crew duties, could be analyzed with total
time on the ground to determine a better fit for the ground time model.

Data collection should be improved using a more reliable system. This should involve some
type of electronic device tied to the current system. Alternatively, a more up-to-date
data system that aerial port crews take with them on every offload and onload could be
used. The port crews could verify all cargo present and record the start and stop times
of the offloads and onloads. The device could transfer the data electronically into the
database and thereby eliminate the current transfer error. This type of technology can be
seen on the C-17, which is equipped with a computer system that can automatically report
landing times, fuel on board, and takeoff times to TACC. This data collection capability
and emphasis needs to be extended to all aspects of the Air Mobility Command process,
including port crews.

C-130s should use the current system they have in place. The current C-130 system uses a
historical database of how long specific types of cargo loads have taken. The Combined
Air Operations Center C-130 mission planners apply these times to predict relatively
accurate ground times. This success is also due to the C-130’s small cargo loads. C-130s
generally have short and almost identical ground times no matter what type or how much
cargo is offloaded or onloaded. This resulted in no significance in the linear regression
models. The other components of ground time and the C-130’s tendency to conduct engine
running offloads and onloads are more influential in determining ground times. Since the
times for actions other than loading and unloading are mostly constant, ground times are
mostly identical for all types of cargo. Therefore, the current system already creates a
stable environment with the subject matter experts planning missions.

The significant models should be used by AMC/A9. Once accredited by AMC/A9, these models
should be used by TACC on a trial basis. The results of these models should give a more
accurate account of ground times for C-17s at Kandahar and Camp Bastion and C-5s at
Bagram. Although significance was not found in all models, the significant models should
be used when applicable to create a more stable air mobility system. Not counting the
C-130s, there were a total of five scenarios. Three of the models for these scenarios
were found to be significant.

5.6.  EXCEL  Based  Tool 

An EXCEL based tool was designed and built for AMC/A9 and TACC planners. This tool
predicts ground times based on the regression expressions found from the models of the
three scenarios with significance. Ease of use is critical to quickly and effectively
planning operations in a wartime environment. The “AMC OEE Ground Time Predictor” has an
easy to use interface in the EXCEL program. Any TACC planner can use this tool to predict
C-17 ground times at Kandahar AB and Camp Bastion or C-5 ground times at Bagram. The only
inputs required are the type of aircraft, the location, and the equivalent pallet
positions for the regressors used in each equation. The remaining regressors are not used
and therefore are not included in the tool. This tool is shown in Appendix D.
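
The calculation behind such a tool can be sketched in a few lines. The Python example
below uses the parameter estimates listed under Figure C.2 in Appendix C (square root
transformation, taken here to be the Scenario 2 model for C-17s at Kandahar AB) and
squares the linear predictor to return minutes. The function and argument names are
hypothetical, and this is an illustration of the arithmetic rather than the EXCEL tool
itself.

```python
# Parameter estimates as listed under Figure C.2 (response: square root of
# actual time on ground in minutes). Treating them as the Scenario 2 model
# is an assumption of this sketch.
COEFFICIENTS = {
    "pc_off": 0.238186,    # palletized cargo offloaded (pallet positions)
    "t2_off": 0.2657309,   # pallet train of 2 offloaded
    "pc_on": -0.119009,    # palletized cargo onloaded
    "rs_on": -0.196195,    # rolling stock onloaded
    "t3_on": -0.84792,     # pallet train of 3 onloaded
    "pax_off": -0.00664,   # passengers offloaded
}
INTERCEPT = 12.82002


def predict_ground_time_minutes(**loads):
    """Predict ground time in minutes; omitted regressors default to zero."""
    unknown = set(loads) - set(COEFFICIENTS)
    if unknown:
        raise ValueError(f"unknown regressors: {sorted(unknown)}")
    sqrt_minutes = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in loads.items()
    )
    return sqrt_minutes ** 2  # undo the square root transformation


baseline = predict_ground_time_minutes()
loaded = predict_ground_time_minutes(pc_off=10, pax_off=50)
print(f"no cargo: {baseline:.0f} min ({baseline / 60:.2f} h)")
print(f"10 pallet positions and 50 passengers offloaded: {loaded:.0f} min")
```

With no cargo or passengers the sketch returns the squared intercept, about 2.74 hours,
which is close to the C-17 figure of 2.73 hours quoted in Chapter 4.
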
5.7. Future Research


With proper data collection, the linear regression analysis could be re-accomplished for
these airfields and aircraft to produce models that fit the data more accurately. The
research could also be expanded to include all Afghanistan airfields as well as airfields
in Iraq. This data could also be placed into a much larger model for aircraft throughput.
This research could build on past and current integer programming, simulation, and
stochastic techniques. The output from these updated models would yield more accurate
airflow through USTRANSCOM’s combat and global environment.
Appendix A: Stepwise Regression Output

Figure A.1. Stepwise Regression Scenario 1

Figure A.2. Stepwise Regression Scenario 2

Figure A.3. Stepwise Regression Scenario 3

Figure A.5. Stepwise Regression Scenario 5
JMP Stepwise Regression Control output. 46 rows not used due to excluded rows or missing
values.

Figure A.6. Stepwise Regression Scenario 6
C-130 OAZI ALL DATA SORTED - Fit Stepwise - JMP
Stepwise Fit for ACTUAL TIME ON GROUND mins. 22 rows not used due to excluded rows or
missing values.

Figure A.7. Stepwise Regression Scenario 7
C-5 OAIX ALL DATA SORTED - Fit Stepwise - JMP
Stepwise Fit for ACTUAL TIME ON GROUND mins. 3 rows not used due to excluded rows or
missing values.

Figure A.8. Stepwise Regression Scenario 8
C-5 OAKN ALL DATA SORTED - Fit Stepwise - JMP
Stepwise Fit for ACTUAL TIME ON GROUND mins. 5 rows not used due to excluded rows or
missing values.
Appendix B: Standard Least Squares Output

Figure B.1. Scenario 1

C-17 OAIX ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins

Summary of Fit
RSquare                      0.256776
RSquare Adj                  0.24872
Root Mean Square Error       52.74347
Mean of Response             168.8289
Observations (or Sum Wgts)   374

Parameter Estimates
Term            Estimate     Std Error   t Ratio   Prob>|t|
Intercept       172.6641     5.307362      32.53   <.0001*
PC off          1.7744912    0.54255        3.27   0.0012*
RS off          2.297414     0.731978       3.14   0.0018*
T6 on           -19.08218    8.856097      -2.15   0.0318*
pax offloaded   -0.39493     0.049672      -7.95   <.0001*
Figure B.2. Scenario 2

C-17 OAKN ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins

Summary of Fit
RSquare                  0.532187
RSquare Adj              0.524721
Root Mean Square Error   53.65616

Parameter Estimates (only the end of the final row is recoverable)
0.054386   -3.38   0.0008*

Analysis of Variance
Source   DF    Sum of Squares   Mean Square   F Ratio
Model      6   1231454.1        205242        71.2899
Error    376   1082497.7
Figure B.3. Scenario 3

C-17 OAZI ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins

Summary of Fit
RSquare                      0.305066
RSquare Adj                  0.300464
Root Mean Square Error       54.99648
Mean of Response             183.1422
Observations (or Sum Wgts)   457
Figure B.4. Scenario 4

C-130 OAIX ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins

Summary of Fit
RSquare                      0.108679
RSquare Adj                  0.102413
Root Mean Square Error       33.92987
Mean of Response             54.74042
Observations (or Sum Wgts)   574
Figure B.5. Scenario 5

C-130 OAKN ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins
Figure B.6. Scenario 6

C-130 OAZI ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins

Summary of Fit
RSquare                      0.090991
RSquare Adj                  0.080297
Root Mean Square Error       21.49879
Mean of Response             46.45087
Observations (or Sum Wgts)   173
Figure B.7. Scenario 7

C-5 OAIX ALL DATA SORTED - Fit Least Squares - JMP
Response: ACTUAL TIME ON GROUND mins
Actual by Predicted Plot

Figure B.8. Scenario 8

No regressors included in stepwise regression.
APPENDIX C: Square Root Transformation on Y variable

Standard Least Squares Output applied with the square root transformation on Y variable


Figure C.1. Scenario 1

Figure C.2. Scenario 2

Response: Sqrt(ACTUAL TIME ON GROUND mins)

Summary of Fit
RSquare                      0.512264
RSquare Adj                  0.504481
Root Mean Square Error       1.862015
Mean of Response             13.58843
Observations (or Sum Wgts)   383

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio
Model        6   1369.1898        228.198       65.8182
Error      376   1303.6291        3.467         Prob > F
C. Total   382   2672.8189                      <.0001*

Lack Of Fit
Source        DF    Sum of Squares   Mean Square
Lack Of Fit   226   777.7709         3.44146       F Ratio    0.9817
Pure Error    150   525.8582         3.50572       Prob > F   0.5534
Total Error   376   1303.6291                      Max RSq    0.8033

Parameter Estimates
Term            Estimate     Std Error   t Ratio   Prob>|t|
Intercept       12.82002     0.165712      77.36   <.0001*
PC off          0.238186     0.015175      15.70   <.0001*
T2 off          0.2657309    0.076526       3.47   0.0006*
PC on           -0.119009    0.030152      -3.95   <.0001*
RS on           -0.196195    0.044043      -4.45   <.0001*
T3 on           -0.84792     0.181911      -4.66   <.0001*
pax offloaded   -0.00664     0.001887      -3.52   0.0005*

Figure C.3. Scenario 3

Figure C.4. Scenario 7

APPENDIX D: EXCEL BASED TACC TOOL

Ground time predictor
1) Select aircraft and airfield from drop down box
2) Input equivalent pallet positions and number of passengers (only needed types for
   prediction are included)
3) Ground time appears in minutes and hours

T2 off = Pallet Train of 2 offloaded; 1 T2 = 1 for model
PC on = Palletized cargo onloaded
RS on = Rolling Stock cargo onloaded
T3 on = Pallet Train of 3 onloaded; 1 T3 = 1 for model
T6 on = Pallet Train of 6 onloaded; 1 T6 = 1 for model
Pax off = # of total passengers offloaded

APPENDIX E: Blue Dart


OPTIMIZING GROUND TIMES FOR AMC AIRCRAFT IN AFGHANISTAN

Air Mobility Command’s (AMC) airlift assets that transit airfields in Afghanistan are
given only a small variety of ground times in which to accomplish their mission. These
ground times are based on overarching categories of missions that aircraft execute, such
as cargo upload, cargo download, passenger upload, passenger download, or a combination
of these. The current mission planning system uses these overarching categories to plan
ground times and does not account for the exact amount of cargo or passengers. This
leads to longer or shorter ground times than planned. In order to increase stability at
these fields and better account for the number of sorties that can be planned into
Afghanistan, a method to calculate optimal or near optimal ground times is needed.

This research creates a linear regression model that accounts for the size of cargo
upload, cargo download, passenger upload, and passenger download known by the mission
planner. This model can be used by the mission planners at AMC’s Tanker Airlift Control
Center (TACC) to increase the efficiency with which they plan sorties. Eight scenarios
are analyzed to account for C-17, C-130 and C-5 missions to Bagram AB, Kandahar AB and
Camp Bastion airfields in Afghanistan. Three of the scenario models are found to be
significant and are validated with split data from a separate month’s worth of data. The
use of the three significant models will increase stability and efficiency in AMC
planning by reducing early and late times by an average of seven minutes per mission.
This improves planning stability by 74.1 hours per month. In turn, our overall wartime
effectiveness will be enhanced.

APPENDIX F: Summary Chart